In the UNITED STATES PATENT and TRADEMARK OFFICE
APPLICATION OF M. ROYER, D. W. GABRIEL, R. FRUTOS AND P. ROTT

COMPLETE BIOSYNTHETIC GENE SET FOR SYNTHESIS OF POLYKETIDE
ANTIBIOTICS, INCLUDING THE ALBICIDIN FAMILY, RESISTANCE GENES AND
USES THEREOF

TECHNICAL FIELD

The invention is in the field of genetic engineering, and in particular the isolation and
expression of the biosynthetic genes that produce a family of antibiotics known generically as
albicidins.

BACKGROUND OF THE INVENTION

U.S. Patent No. 4,525,354 to Birch and Patil described a "non-peptide" antibiotic of molecular
weight "about 842" called "albicidin". Albicidin is described as produced by culturing
chlorosis-inducing strains of Xanthomonas albilineans isolated from diseased sugarcane, and
mutants thereof. The antibiotic was isolated from the culture medium by adsorption on resin and
was purified by gel filtration and High Performance Liquid Chromatography (HPLC). The
chemical structure of this antibiotic was not determined and remained unknown, although the
Birch and Patil patent disclosed spectral data for a fraction having antibiotic activity and the
presence of approximately 38 carbon atoms and at least one COOH group. The present invention
describes and characterizes the family of antibiotics that is produced by culturing
chlorosis-inducing strains of X. albilineans and mutants thereof, together with the complete set
of twenty biosynthetic genes capable of producing the unique and previously uncharacterized
family of antibiotics produced by X. albilineans and previously lumped together as "albicidins".
The set of twenty biosynthetic genes isolated, purified and cloned from a culture of X. albilineans
revealed that this set of biosynthetic genes is capable of synthesizing products exhibiting a high
level of variation among the products, indicating that albicidins comprise a family of polyketide
antibiotics. The albicidins described in the present invention are synthesized by twenty genes,
including one polyketide-peptide synthase, one polyketide synthase and two peptide synthases,
but the substrates of the polyketide-peptide synthase and of one peptide synthase are not α-amino
acids. The biosynthetic enzymes represent a previously undescribed and unique polyketide
antibiotic biosynthetic system.

Xanthomonas albilineans is a systemic, xylem-invading pathogen that causes leaf scald
disease of sugarcane (interspecific hybrids of Saccharum species) (Ricaud and Ryan, 1989; Rott
and Davis, 2000). Leaf scald symptoms include chlorosis, necrosis, rapid wilting, and plant
death. Chlorosis-inducing strains of the pathogen produce several toxic compounds. The major
toxic component, named albicidin, inhibits chloroplast DNA replication, resulting in blocked
chloroplast differentiation and chlorotic leaf streaks that are characteristic of the plant disease
(Birch and Patil, 1983, 1985b, 1987a and 1987b). Several studies established that albicidin plays
a key role in pathogenesis and especially in the development of disease symptoms (Wall and
Birch, 1997; Zhang and Birch, 1997; Zhang et al., 1999; Birch, 2001).

The prior art indicates that albicidin inhibits prokaryotic DNA replication and is
bactericidal to a range of gram-positive and gram-negative bacteria (Birch and Patil, 1985a).
Albicidin is therefore of interest as a potential clinical antibiotic (Birch and Patil, 1985a).
However, low yield of toxin production in X. albilineans has slowed down studies into the
chemical structure of albicidin and its therapeutic application (Zhang et al., 1998). The chemical
structure of this albicidin remains unknown; however, this albicidin has been partially
characterized as a non-peptide antibiotic with a molecular weight of about 842 that contains
approximately 38 carbon atoms with three or four aromatic rings, at least one COOH group, two
OCH3 groups, a trisubstituted double bond and a CN linkage (Birch and Patil, 1985a; Huang et
al., 2001).

Molecular cloning and characterization of the genes governing the biosynthesis of
albicidin is of considerable interest because such information indicates approaches to engineer
overproduction of albicidin, to characterize its chemical structure, to allow therapeutic
applications and to clarify the relationship between toxin production and the ability to colonize
sugarcane. Two similar mutagenesis and complementation studies have been conducted to
identify the genetic basis of albicidin production in X. albilineans strains isolated in two different
geographical locations, Australia and Florida.

One study of X. albilineans strain LS155 from Australia revealed that genes for albicidin
biosynthesis and resistance span at least 69 kb (Wall and Birch, 1997). Subsequently, three genes
required for albicidin biosynthesis were identified, cloned and sequenced from two Australian
strains of X. albilineans (LS155 and Xa13): xabA, xabB and xabC (Huang et al., 2001; Huang
et al., 2000a, 2000b). The xabB gene encodes a large protein with a predicted size of 525.6 kDa,
with a modular architecture indicative of a multifunctional polyketide synthase (PKS) linked to
a nonribosomal peptide synthetase (NRPS) (Huang et al., 2001). The xabC gene, located
immediately downstream from xabB, encodes an S-adenosyl-L-methionine (SAM)-dependent
O-methyltransferase (Huang et al., 2000a). The xabA gene, located in another region of the
genome, encodes a phosphopantetheinyl transferase required for post-translational activation of
PKS and NRPS enzymes (Huang et al., 2000b).

These first results demonstrated that the albicidin biosynthesis apparatus is a PKS and/or
NRPS system. Such systems assemble single acyl-coenzyme A or amino acid monomers to
produce polyketides and/or nonribosomal peptides (Marahiel et al., 1997; Cane, 1997; Cane and
Walsh, 1999). These metabolites form very large classes of natural products that include
numerous important pharmaceuticals, agrochemicals, and veterinary agents such as antibiotics,
immunosuppressants, anti-cholesterolemics, as well as antitumor, antifungal and antiparasitic
agents. Genetic studies of prokaryotic PKS and NRPS produced detailed information regarding
the function and the organization of genes responsible for the biosynthesis of polyketides and
nonribosomal peptides. Such knowledge, in turn, made it possible to produce combinations of
PKS and NRPS genes from different microorganisms in order to produce novel antibiotics
(McDaniel et al., 1999; Rodriguez and McDaniel, 2001; Pfeifer et al., 2001). Investigating the
complete albicidin biosynthesis apparatus is therefore of great interest because such results may
contribute to the knowledge as to how PKS and NRPS interact and how they might be
manipulated to engineer novel molecules.

A second study with X. albilineans strain Xa23R1 from Florida revealed that at least two
gene clusters, one spanning more than 48 kb, are involved in albicidin production (Rott et al.,
1996). This conclusion was based on the following data: (i) fifty Xa23R1 mutants defective in
albicidin production were isolated; (ii) a Xa23R1 genomic library of 845 clones, designated
pALB1 to pALB845, was constructed; (iii) two overlapping DNA inserts of approximately 47
kb and 41 kb, from clones pALB540 and pALB571 respectively, complemented forty-five
mutants and were supposed to contain a major gene cluster involved in albicidin production; (iv)
a DNA insert of approximately 36 kb, from clone pALB639, complemented four of the five
remaining mutants not complemented by pALB540 and pALB571, and was supposed to contain
a second region involved in albicidin production; and (v) the remaining mutant, AM37, which
was not complemented by any of the three cosmids pALB540, pALB571 and pALB639, was
supposed to be mutated in a third region of the genome involved in albicidin production.

The DNA sequences of all of the genes required to produce the albicidin family of
polyketide antibiotics, the expressed protein amino acid sequences of all of the genes and the
deduced structure of Albicidin have not been previously reported, although fragmentary
sequences that include three of the biosynthetic genes have been reported. Identification of one
albicidin gene, xabC, as a methyltransferase gene involved in albicidin biosynthesis is reported
by Huang, G., Zhang, L. & Birch, R.G. (2000a, Gene 255, 327-333) and claimed as biologically
active in producing a polyketide antibiotic in PCT WO 02/24736 A1. Identification of a second
albicidin gene, xabA, as a phosphopantetheinyl transferase gene is reported by Huang, G., Zhang,
L. and Birch, R.G. (2000b) Gene 258, 193-199 and claimed as biologically active in producing
a polyketide antibiotic in PCT WO 02/24736 A1. Huang, G., Zhang, L. & Birch, R.G. (2001)
Microbiology 147, 631-642, report a DNA sequence of xabB (GenBank accession #AF239749)
encoding a multifunctional polyketide-peptide synthetase that may be essential for albicidin
biosynthesis in Xanthomonas albilineans. This xabB gene is reported as full length by Birch in
PCT WO 02/24736 A1 (their seq. ID #1) and claimed by Birch in PCT WO 02/24736 A1 as a
biologically active polyketide synthase of 4,801 amino acids in length, enabling production of
albicidin. However, the DNA sequence reported by Huang et al. (2001) in GenBank AF239749
and by Birch in PCT WO 02/24736 A1 (their seq. ID #1) appears to be incomplete and missing
6,234 bp of DNA sequence encoding 2,078 amino acids. We claim the complete DNA sequence of
xabB (albI, our seq. ID #20) as 20,637 bp, encoding a biologically active polyketide synthase of
6,879 amino acids in this application (our seq ID #26). Factors affecting biosynthesis by
Xanthomonas albilineans of albicidins antibiotics and phytotoxins are discussed in J. Appl.
Microbiol. 85, 1023-1028, and Wall, M.K. & Birch, R.G. (1997). Genes for albicidin biosynthesis
and resistance span at least 69 kb in the genome of Xanthomonas albilineans. Lett. Appl.
Microbiol. 24, 256-260. A gene from X. albilineans strain Xa13, designated AlbF, which confers
high level albicidin resistance in Escherichia coli and which encodes a putative albicidin efflux
pump, was directly submitted to Genbank by Bostock and Birch (Accession no. AF403709).
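
The sequence lengths quoted in the preceding paragraph can be cross-checked with simple codon arithmetic. The following is only an illustrative sketch: the constant and function names are hypothetical, the figures are copied from the text rather than recomputed from any sequence file, and a stop codon is assumed not to be counted in the coding length.

```python
# Figures quoted in the text (assumed to be read correctly from the source).
PRIOR_ART_AA = 4801   # xabB product length claimed in PCT WO 02/24736 A1
COMPLETE_AA = 6879    # length of the complete product claimed in this application
COMPLETE_BP = 20637   # length of the complete xabB (albI) coding sequence
MISSING_BP = 6234     # DNA reported as missing from the prior-art sequence


def coding_bp(n_amino_acids: int) -> int:
    """Base pairs needed to encode n amino acids (3 per codon, stop codon excluded)."""
    return 3 * n_amino_acids


missing_aa = COMPLETE_AA - PRIOR_ART_AA

print(missing_aa)              # 2078
print(coding_bp(missing_aa))   # 6234
print(coding_bp(COMPLETE_AA))  # 20637
```

Under these assumptions the three quoted figures are mutually consistent: the difference of 2,078 amino acids corresponds to 6,234 bp, and 6,879 amino acids correspond to 20,637 bp.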

SUMMARY OF THE INVENTION

The invention provides a novel antibiotic family, Albicidins, produced by three novel
biosynthetic gene clusters (XALB1, XALB2, and XALB3) contained within a host cell DNA in
which one strand comprises non-contiguously SEQ. ID No. 1, SEQ. ID No. 2 and SEQ ID No.
3, and the cell expresses the DNA to provide peptides including those named AlbI (SEQ ID No.
26) (encoded by SEQ ID No. 20), AlbII (SEQ ID No. 27) (encoded by SEQ ID No. 21), AlbIII
(SEQ ID No. 28) (encoded by SEQ ID No. 22), AlbIV (SEQ ID No. 29) (encoded by SEQ ID
No. 23), AlbVI (SEQ ID No. 31) (encoded by SEQ ID No. 18), AlbVII (SEQ ID No. 32)
(encoded by SEQ ID No. 17), AlbVIII (SEQ ID No. 33) (encoded by SEQ ID No. 16), AlbIX
(SEQ ID No. 34) (encoded by SEQ ID No. 15), AlbX (SEQ ID No. 35) (encoded by SEQ ID
No. 10), AlbXI (SEQ ID No. 36) (encoded by SEQ ID No. 9), AlbXII (SEQ ID No. 37)
(encoded by SEQ ID No. 8), AlbXIII (SEQ ID No. 38) (encoded by SEQ ID No. 7),
AlbXIV (SEQ ID No. 39) (encoded by SEQ ID No. 6), AlbXV (SEQ ID No. 40) (encoded by
SEQ ID No. 5), AlbXVII (SEQ ID No. 42) (encoded by SEQ ID No. 11), AlbXVIII (SEQ ID
No. 43) (encoded by SEQ ID No. 12), AlbXIX (SEQ ID No. 44) (encoded by SEQ ID No.
13), AlbXX (SEQ ID No. 45) (encoded by SEQ ID No. 14), AlbXXI (SEQ ID No. 46)
(encoded by SEQ ID No. 24), and AlbXXII (SEQ ID No. 47) (encoded by SEQ ID No. 25), that
in turn interact within the host cell to produce one or more antibiotics as more fully illustrated
in Figure 11. In one embodiment the invention comprises a plurality of isolated and purified
DNA strands which comprise nucleotide sequences of the group consisting of SEQ ID No. 1 to
SEQ. ID No. 25, each individual sequence, except the transposases AlbV (SEQ ID No. 30)
(encoded by SEQ ID No. 19) and AlbXVI (SEQ ID No. 41) (encoded by SEQ ID No. 4)
found in the XALB1 cluster, being necessary to the biosynthesis of the novel family of
antibiotics, Albicidins. The invention also includes the peptides or proteins encoded by the
genes of the biosynthetic complex expressed by the combination of DNA with a strand having
sequences SEQ ID Nos. 1 to 3. Proteins are named with Roman numerals and the prefix Alb;
AlbI to AlbXXII have the amino acid sequences of SEQ ID Nos. 26 to 47 (not in Roman numeral
order but in the order of placement of the genes within sequences SEQ ID Nos. 1 to 3 that express
each protein). Expression of the peptides having the amino acid sequences of SEQ ID Nos. 26
to 29, 31 to 40 and 42 to 47 has been found to be required for the successful biosynthesis
of Albicidins. The invention provides a method for producing Albicidins comprising providing
a modified host cell with a heterologous DNA Albicidin Biosynthetic Gene Cluster or set of
genes defined as DNA operably comprising DNA sequences substantially similar to SEQ ID
Nos. 1 to 3. Substantially the same means DNA having sufficient homology to provide expressed
proteins that function to provide an antibiotic material having the structural components
identified herein. Preferably a given sequence will have at least 70 percent homology to one of
SEQ ID Nos. 1 to 3, preferably 85% homology and most preferably at least 95% homology. The
method includes the steps consisting of modifying the DNA of the host cell to comprise an
operable expression system for maintaining the modified host cell under conditions supporting
biosynthesis of Albicidins and isolation of Albicidins from the host cell or its environment. The
invention further provides a method of production of a group of novel antibiotic materials
utilizing at least three of the Sequences selected from the group consisting of DNA SEQ ID No.
1 to SEQ ID No. 25 (excluding transposases encoded by SEQ IDs No. 4 and 19) inclusive in
combination with additional sequences to produce a modified Albicidin-like material.
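
The pairing of protein and DNA sequence identifiers listed above is easier to audit as a lookup table. The sketch below is illustrative only: the dictionary and variable names are hypothetical, and the Roman-numeral protein names are as read from the listing in this summary, which is an assumption wherever the printed text is hard to read.

```python
# Protein name -> (protein SEQ ID No., encoding DNA SEQ ID No.), as listed in the summary.
ALB_SEQ_IDS = {
    "AlbI": (26, 20), "AlbII": (27, 21), "AlbIII": (28, 22), "AlbIV": (29, 23),
    "AlbV": (30, 19), "AlbVI": (31, 18), "AlbVII": (32, 17), "AlbVIII": (33, 16),
    "AlbIX": (34, 15), "AlbX": (35, 10), "AlbXI": (36, 9), "AlbXII": (37, 8),
    "AlbXIII": (38, 7), "AlbXIV": (39, 6), "AlbXV": (40, 5), "AlbXVI": (41, 4),
    "AlbXVII": (42, 11), "AlbXVIII": (43, 12), "AlbXIX": (44, 13), "AlbXX": (45, 14),
    "AlbXXI": (46, 24), "AlbXXII": (47, 25),
}

# The two transposases are stated not to be necessary for biosynthesis.
TRANSPOSASES = {"AlbV", "AlbXVI"}

required = {name: ids for name, ids in ALB_SEQ_IDS.items() if name not in TRANSPOSASES}
protein_ids = sorted(p for p, _ in ALB_SEQ_IDS.values())
dna_ids = sorted(d for _, d in ALB_SEQ_IDS.values())

print(len(ALB_SEQ_IDS), len(required))  # 22 20
print(protein_ids[0], protein_ids[-1])  # 26 47
print(dna_ids[0], dna_ids[-1])          # 4 25
```

Read this way, the 22 proteins use each of SEQ ID Nos. 26 to 47 and each of SEQ ID Nos. 4 to 25 exactly once, and removing the two transposases leaves the twenty required genes.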

More specifically, the invention provides DNA Sequences comprising at least about
68,498 base pairs more or less and including an about 55,839 bp region from the genome of X.
albilineans designated as XALB1 (SEQ ID. No. 1) and additional noncontiguous regions having
about 2,986 bp, XALB2 (SEQ ID. No. 2), and about 9,673 bp, XALB3 (SEQ ID. No. 3). These
sequences were found to be required for biosynthesis of Albicidins. Homology analysis revealed
the presence of (i) four large genes with a modular architecture characteristic of polyketide
synthases (PKS) and nonribosomal peptide synthetases (NRPS) potentially involved in albicidin
precursor biosynthesis; (ii) four smaller genes potentially involved in albicidin substrate
biosynthesis; (iii) four modifying genes; (iv) one enzyme activating gene; (v) two regulatory
genes; (vi) one chaperone gene; (vii) two genes of unknown function; and (viii) two resistance
genes. These are named and discussed more fully below. Together these genes allow the
successful operation of the biosynthetic pathway when cloned into suitable host cells. Alignment
of individual NRPS and PKS domains revealed an extraordinary biosynthetic apparatus believed
to involve a trans-action of separate PKS and NRPS domains which could contribute to the
production of multiple, structurally related albicidins by the same gene cluster. Furthermore,
analysis of selectivity-conferring residues indicated that four NRPS modules of XALB1 specify
an unusual substrate. Through the interaction of these genes the following methods are enabled:
a) In an alternate embodiment the invention provides a method of producing a polyketide
carrying para-aminobenzoic acid and/or carbamoyl benzoic acid by inserting at least one DNA
Fragment that encodes a PKS protein into a cell and causing the cell to express the encoded PKS
protein under conditions such that the PKS protein functions to produce a polyketide carrying
either a para-aminobenzoic acid or a carbamoyl benzoic acid or both. b) Another alternate
embodiment is a method of producing polyketide/peptides carrying para-aminobenzoic acid
and/or carbamoyl benzoic acid by inserting at least one DNA Fragment that encodes a PKS
protein into a cell and causing the cell to express the encoded PKS protein under conditions such
that the PKS protein functions to produce a polyketide carrying either a para-aminobenzoic acid
or a carbamoyl benzoic acid or both. Or c) In another alternate embodiment the invention
provides a method of activating nonproteinogenic amino acids like para-aminobenzoic acid and/or
carbamoyl benzoic acid for incorporation into peptides or polyketides by inserting at least one
DNA Fragment that encodes a PKS protein into a cell and causing the cell to express the encoded
PKS protein under conditions such that the PKS protein functions to produce a polyketide
carrying either a para-aminobenzoic acid or a carbamoyl benzoic acid or both.
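
The region sizes and gene categories given at the start of this passage can be tallied as a quick consistency check. This is a minimal sketch with hypothetical names; the numbers are taken from the text as read, and the category labels are shortened paraphrases of items (i) to (viii).

```python
# Region sizes in base pairs, as stated in the text.
REGION_BP = {"XALB1": 55839, "XALB2": 2986, "XALB3": 9673}

# Gene counts per category, items (i) to (viii) of the homology analysis.
GENE_CATEGORIES = {
    "modular PKS/NRPS": 4,
    "substrate biosynthesis": 4,
    "modifying": 4,
    "enzyme activating": 1,
    "regulatory": 2,
    "chaperone": 1,
    "unknown function": 2,
    "resistance": 2,
}

total_bp = sum(REGION_BP.values())
total_genes = sum(GENE_CATEGORIES.values())

print(total_bp)     # 68498
print(total_genes)  # 20
```

On this reading, the three regions add up to the 68,498 bp quoted for the complete set, and the eight categories account for the twenty biosynthetic genes.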

There are three regions of the X. albilineans genome specifying albicidin production.
The XALB2 and XALB3 regions each contain only one gene, both of which are required for
post-translational activation and folding of albicidin PKS and NRPS enzymes. The XALB1, XALB2
and XALB3 gene clusters are characterized by an unusual hybrid NRPS-PKS system, indicating
that albicidin biosynthesis may provide an excellent model for investigating the biosynthesis of
hybrid polyketide-polypeptide metabolites in bacteria. The availability of three genomic regions
involved in albicidin production, XALB1 and XALB2 and XALB3, also offers the ability to
express individually the enzymes of the albicidin family biosynthetic pathway including
structural, resistance, secretory and regulatory elements, and to engineer overproduction of
albicidin in mutated or modified host cells of the invention. The invention overcomes prior art
limitations in albicidin production due to low yields of toxin production in X. albilineans and may
also allow characterization of the chemical structure of albicidin as well as application of this
potent inhibitor of prokaryote DNA replication.

The invention results from a number of unpredictable results, namely the number and
complexity of the enzymes involved in biosynthesis. The discovery of the complete sequence
required for biosynthesis of Albicidins is previously unreported. The invention provides for a
novel process for production of a molecule having a polyketide-polypeptide backbone and the
formula C40H35O15N6, a molecular weight of 839, and the structural elements shown in Figure 11.
The invention further includes (a) the Albicidin Family Biosynthetic Gene Cluster including (b)
the structural and regulatory elements of the operons that encode (c) the enzymes PKS-1, PKS-2,
PKS-3, PKS-4, NRPS-1, NRPS-2, NRPS-3, NRPS-4, NRPS-5, NRPS-6 and NRPS-7 as well as
(e) the proteins AlbI to AlbXXII, (f) the isolated enzymes, proteins, and active forms thereof,
as well as mutants, fragments, and fusion proteins comprising any of the foregoing; (g) the uses
of the enzymes or proteins encoded by the Albicidins Biosynthesis Gene Cluster or any one of
its operons, (h) a host cell expressing one or more enzymes or proteins encoded by the Albicidin
Family Biosynthetic Gene Cluster; (i) use of host cells having the Albicidins Biosynthesis Gene
Cluster to produce an antibiotic; (j) methods of modifying the DNA sequences to produce
members of a series of antibiotic compounds having structures related to Albicidins; (k) DNA
sequences that encode the same proteins as any of SEQ. ID. Nos. 1 to 25 but differ in specific
codons due to the multiplicity of codons that lead to expression of the same amino acid, (l)
antibiotics produced by the process of expression of the Albicidin Family Biosynthetic Genes
in a genetically modified host cell sustained in a culture medium and thereafter separation of the
antibiotic from the host cell and culture medium, (m) an isolated and purified antibiotic produced
by a process that includes at least three proteins coded by DNA sequences selected from the group
consisting of SEQ. ID Nos. 1 to 25 in combination with additional enzymes that modify the
product to provide a non-naturally occurring Albicidins-like product having at least one of the
useful properties reported for albicidin and (n) a process for producing an antibiotic that
comprises modifying a host cell to enhance expression of the DNA of the Albicidin Family
Biosynthetic Gene Cluster by insertion of expression enhancing DNA into the genome of a
Xanthomonas albilineans strain in a position operative to enhance expression of the enzymes of
the Albicidin Family Biosynthetic Gene Cluster, culturing the modified host cell to produce an
antibiotic and isolating the antibiotic. The products and methods described above have utility as
proteins or as nucleic acids as the case may be, including such uses as sources of pyrimidine or
purine bases or amino acids, or as animal food supplements and the like, as well as the more
important uses to provide antibiotics, plant disease treatment methods, genetically modified
disease resistant plants, phytotoxins and the like.
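
The molecular formula and the molecular weight of 839 stated above can be compared using nominal (integer) atomic masses. The sketch below assumes the printed formula reads C40H35O15N6; that reading of the subscripts is an assumption, and the function and variable names are hypothetical.

```python
# Nominal (integer) masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "N": 14, "O": 16}


def nominal_mass(formula: dict) -> int:
    """Sum of nominal atomic masses for an element -> count mapping."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


# Assumed reading of the formula given in the text.
albicidin_formula = {"C": 40, "H": 35, "O": 15, "N": 6}

print(nominal_mass(albicidin_formula))  # 839
```

With six nitrogen atoms the nominal mass equals the stated 839; other nitrogen counts would shift the result by 14 per atom, which is why this reading is the one used here.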

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a physical map and genetic organization of the DNA region containing the major
gene cluster XALB1 involved in the biosynthesis of Albicidins.

Figure 2 is an illustration of the organization of the four PKS modules and the seven NRPS
modules identified in cluster XALB1 and comparison with the organization of the prior art
material XabB.

Figure 3 shows the conserved sequence motifs in O-methyltransferases and C-methyltransferases
involved in antibiotic biosynthesis in bacteria and in AlbII.

Figure 4 shows the conserved sequence motifs in O-methyltransferases and in different tcmP-like
hypothetical proteins and AlbVI.

Figure 5 is an illustration of the alignment of the primary sequences between the conserved
motifs A4 and A5 of Alb NRPSs and PKS-4 in Xanthomonas albilineans with the corresponding
sequences of GrsA (Phe) accession number P14687 and Blm NRPS-2 (β-Ala) accession number
AF210249.

Figure 6 shows Rho-independent transcription terminators identified in the intergenic regions of
XALB1 and XALB3 clusters.

Figure 7A shows sequences identified as a putative bidirectional promoter between albX and
albXVII in XALB1 for transcriptional control of operons 3 and 4.

Figure 7B shows sequences identified as a putative unidirectional promoter upstream from
albXIX for transcriptional control of operon 5 if albXVIII is not expressed.

Figure 8 is a physical map and genetic organization of the DNA region containing the gene
clusters XALB2 and XALB3 involved in albicidin production.

Figure 9A is linear model 1 leading to the biosynthesis of only one polyketide-polypeptide
albicidin backbone.

Figure 9B is linear model 2 leading to the biosynthesis of four different polyketide-polypeptide
backbones.

Figure 10A is an alignment of the conserved motifs in AT domains from RifA-1, -2, -3, RifB-1,
RifE-1 (Rifamycin PKSs; August et al., 1998) and BlmVIII (Bleomycin PKS; Du et al., 2000).

Figure 10B is a comparison of AlbXIII, FenF (a malonyl-CoA transacylase located upstream
from mycA; Duitman et al., 1999) and LipA (a lipase; Valdez et al., 1999).

Figure 11A is a proposed model for biosynthesis of albicidin, including putative substrates of
PKS and NRPS modules.

Figure 11B shows the proposed compositions and structures of albicidins.

DETAILED DESCRIPTION OF THE INVENTION

The invention results from the DNA sequencing of the complete major gene cluster
XALB1, as well as the noncontiguous fragments XALB2 and XALB3. XALB1 is present in
the two overlapping DNA inserts of clones pALB540 and pALB571. Reading frame analysis and
homology analyses allow one to predict the genetic organization of XALB1 and to assign a
function to the genes potentially required for albicidin production. Based on the alignment of
the different PKS and/or NRPS enzymes encoded by XALB1 we proposed a model for the
albicidin backbone biosynthesis. However, the invention disclosed herein does not depend upon
the accuracy of the proposed model. The invention includes the successful cloning and DNA
sequencing of the second region of the genome (XALB2) involved in albicidin production and
mutated in mutant AM37.

The invention includes the characterization of the third region of the genome (XALB3)
involved in albicidin production present in clone pALB639. These results make it possible
to characterize all enzymes of the albicidin biosynthesis pathway including structural, resistance
and regulatory elements and to engineer overproduction of albicidin.

EXAMPLE 1: Materials and methods 

Bacterial strains and plasmids. The source of bacterial strains and their relevant characteristics
are described in Table 1.

Media, antibiotics, and culture conditions. X. albilineans strains were routinely cultured on
modified Wilbrink's (MW) medium at 30°C without benomyl (Rott et al., 1994). For long-term
storage, highly turbid distilled water suspensions of X. albilineans were supplemented with
glycerol to 15% (vol/vol) and frozen at -80°C. For X. albilineans, MW medium was
supplemented with the following antibiotics as required at the concentrations indicated:
kanamycin, 10 or 25 µg/ml; and rifampicin, 50 µg/ml. E. coli strains were grown on
Luria-Bertani (LB) agar or in LB broth at 37°C and were maintained and stored according to standard
protocols (Sambrook et al., 1989). For E. coli, LB medium was supplemented with the following
antibiotics as required at the concentrations indicated: kanamycin, 50 µg/ml; ampicillin, 50
µg/ml.

Bacterial conjugation. DNA transfer between E. coli donor (DH5αMCR/pAlb389 or pAC389.1,
Table 1) and rifampicin-resistant X. albilineans recipients (strains AM10, AM12, AM13,
AM36 and AM37, Table 1) was accomplished by triparental conjugation with plasmid pRK2073
as the helper as described previously (Rott et al., 1996).



Table 1: Bacterial strains and plasmids used in this study

Strain or plasmid | Relevant characteristics (a) | Reference or source

Strains

E. coli
DH5α | F- φ80dlacZΔM15 Δ(lacZYA-argF)U169 deoR recA1 endA1 hsdR17(rK- mK+) supE44 thi-1 gyrA96 relA1 | Gibco-BRL
DH5αMCR | DH5α mcrA mcrBC mrr | "

X. albilineans
Xa23 | Wild type from sugarcane (Florida) | Rott et al., 1996
Xa23R1 | Spontaneous Rif^r derivative of Xa23 | "
AM strains | Xa23R1::Tn5-gusA, Km^r, Rif^r, Tox- | "

Plasmids

pBR325 | | Gibco-BRL
pBCKS(+) | | Stratagene
pBluescript II KS(+) | Ap^r | "
pRK2073 | pRK2013 derivative, Km^s (npt::Tn7), Sp^r, Tra+, helper plasmid | Leong et al., 1982
pUFR043 | IncW, Mob+, LacZα, Gm^r, Km^r, Cos | De Feyter and Gabriel, 1991
pAlb540 | 47 kb insert from Xa23R1 in pUFR043, Gm^r, Km^r | Rott et al., 1996
pAlb571 | 36.8 kb insert from Xa23R1 in pUFR043, Gm^r, Km^r | "
pAlb639 | 36 kb insert from Xa23R1 in pUFR043, Gm^r, Km^r | "
pAM15.1 | [size illegible] kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM15 in pBR325, Km^r, Tc^r, Ap^r, Cm^r | "
pAM40.[?] | 11 kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM40 in pBR325, Km^r, Tc^r, Ap^r, Cm^r | "
pAM45.1 | 12 kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM45 in pBR325, Km^r, Tc^r, Ap^r, Cm^r | "
pAM12.1 | 13 kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM12 in pBR325, Km^r, Tc^r, Ap^r, Cm^r | "
pAM36.2 | 9 kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM36 in pBR325, Km^r, Tc^r, Ap^r, Cm^r | "
pAlb389 | 37 kb insert from Xa23R1 in pUFR043, Gm^r, Km^r | This study
pAC389.1 | 2.9 kb insert from Xa23R1 in pUFR043, Gm^r, Km^r | "
pAlb639A | 9.4 kb insert from Xa23R1 in pUFR043, Gm^r, Km^r | "
pEV639 | 2.6 kb Sal I insert from Xa23R1 in pUFR043, Gm^r, Km^r | "
pBC/A' | 7.5 kb Kpn I fragment carrying a part of fragment A from pAlb571 in pBCKS(+), Cm^r | "
pBC/AF | 15.2 kb EcoR I fragment carrying fragments A and F from pALB540 in pBCKS(+), Cm^r | "
pBC/B | 11.0 kb Kpn I fragment B from pAlb571 in pBCKS(+), Cm^r | "
pBC/C | 6.0 kb Kpn I fragment C from pAlb571 in pBCKS(+), Cm^r | "
pBC/E | 2.8 kb Kpn I fragment E from pAlb571 in pBCKS(+), Cm^r | "
pBC/F | 2.5 kb Kpn I-EcoR I fragment F from pAlb571 in pBCKS(+), Cm^r | "
pBC/G | 1.9 kb EcoR I fragment G from pAlb571 in pBCKS(+), Cm^r | "
pBC/I | 1.4 kb Kpn I-EcoR I fragment I from pAlb571 in pBCKS(+), Cm^r | "
pBC/J | 0.6 kb EcoR I fragment J from pALB540 in pBCKS(+), Cm^r | "
pBC/K | 4.7 kb EcoR I fragment K from pALB540 in pBCKS(+), Cm^r | "
pBC/L | 0.4 kb EcoR I fragment L from pALB540 in pBCKS(+), Cm^r | "
pBC/N | 7.7 kb EcoR I fragment N from pALB540 in pBCKS(+), Cm^r | "
pUFR043/D' | 2.2 kb EcoR I-Sau3A I fragment carrying a part of fragment D from pAlb571 in pUFR043 | "
pAM1 | 5 kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM1 in pBluescript II KS(+), Km^r, Ap^r | "
pAM4 | 12 kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM4 in pBluescript II KS(+), Km^r, Ap^r | "
pAM7 | 6 kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM7 in pBluescript II KS(+), Km^r, Ap^r | "
pAM10 | 7 kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM10 in pBluescript II KS(+), Km^r, Ap^r | "
pAM29 | 10 kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM29 in pBluescript II KS(+), Km^r, Ap^r | "
pAM37 | 6 kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM37 in pBR325, Km^r, Tc^r, Ap^r, Cm^r | "
pAM52 | 5 kb EcoR I fragment carrying Tn5 and flanking sequences of mutant AM52 in pBluescript II KS(+), Km^r, Ap^r | "

DNA Fragment

PR37 | 1.1 kb Hind III-Hind III from pAM37 | "

(a) Ap^r, Cm^r, Gm^r, Km^r, Rif^r, Sp^r, Tc^r: resistant to ampicillin, chloramphenicol, gentamycin,
kanamycin, rifampicin, spectinomycin, tetracycline, respectively. Tox-, deficient in albicidin
production. Tn5-gusA, Tn5-uidA1 Km^r Tc^r, forms transcriptional fusions.

Assay of albicidin production. Albicidin production was tested by a microbiological assay as 
40 described previously (Rott et aL, 1996). Rifampicin and kanamycin exconjugants were spotted 
with sterile toothpicks (2-mm-diameter spots) onto plates of SPA medium (2% sucrose, 0.5% 
peptone, 1.5% agar) and incubated at 28*C for 2-5 days. The plates were then overlaid with a 
mixture of ^. coli DH5a (10' cells in 2 ml of distilled water) plus 2 ml of molten 1.5% (wt/vol) 
Noble agar (Difco) at ca. 65'C and examined for inhibition zones after 24 h at 37*C. 

45 

Nucleic acid manipulations. Standard molecular techniques were used to manipulate DNA 
(Sambrook et al., 1989) except for total genomic DNA preparation. Total genomic DNA for 
soudxem blot hybridization was prepared as described by Gabriel and De Feyter (1992). 

PCR Conditions. PCR amplifications were performed in an automated thermal cycler PTC-100™
(MJ Research, Inc). The 25 µl PCR reaction mix consisted of 100 ng of genomic DNA or
1 ng of plasmid DNA, 2.5 µl of 10X PCR buffer without MgCl2 (Eurobio), 80 µM dNTP mix,
2.5 units of EUROBIOTAQ II® (Eurobio), 25 pmoles of each primer, 2.0 mM MgCl2 (Eurobio)
and sterilized distilled water to final volume. The PCR program was 95°C for 2 min, 25 cycles
at 94°C for 1 min, Tm for 1 min and 72°C for 1 min, with a final 72°C extension for 5 min. The
Tm temperature was determined for each pair of primers and varied between 55°C and 60°C. A
5 µl aliquot of each amplified product was analyzed by electrophoresis through a 1% agarose
gel. For sequencing, PCR products were cloned with the pGEM®-T Easy Vector System (Promega).
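
The reaction mix and cycling program above imply a few simple figures that can be checked by hand. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, and the run time counts hold times only, ignoring ramping between temperatures.

```python
# Illustrative arithmetic for the PCR conditions described above.
# Function names are hypothetical; ramp times between steps are ignored.

REACTION_VOLUME_UL = 25.0


def final_buffer_strength(stock_x: float, volume_ul: float) -> float:
    """Final buffer strength (in X) after dilution into the reaction volume."""
    return stock_x * volume_ul / REACTION_VOLUME_UL


def primer_concentration_um(pmoles: float) -> float:
    """pmol per µl is numerically equal to µM."""
    return pmoles / REACTION_VOLUME_UL


def program_minutes(cycles: int = 25) -> float:
    """Hold times only: 2 min initial step, three 1-min steps per cycle,
    and a 5 min final extension."""
    return 2 + cycles * (1 + 1 + 1) + 5


if __name__ == "__main__":
    print("buffer strength (X):", final_buffer_strength(10, 2.5))
    print("each primer (µM):", primer_concentration_um(25))
    print("minimum run time (min):", program_minutes())
```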
Oligonucleotide synthesis. Oligonucleotides were purchased from Genome Express (Grenoble
or Montreuil, France).

DNA sequencing. Automated DNA sequencing was carried out on double-stranded DNA by the
dideoxynucleotide chain termination method (Sanger et al., 1977) using a Dye Terminator Cycle
Sequencing kit and an ABI Perkin-Elmer sequencer according to the manufacturer's procedure.
Both DNA strands were sequenced with universal primers or with internal primers (20mers). This
service was provided by Genome Express (Grenoble, France). Computer-aided sequence analyses
were carried out using Sequence Navigator™ (Applied Biosystems, Inc) and SeqMan
(DNASTAR Inc.) programs.

Sequence analysis. Nucleotide sequences were translated in all six reading frames using EditSeq
(DNASTAR Inc.). Potential products of ORFs longer than 100 b were compared to protein
databases by the PSI-BLAST program (Swiss-Prot and GenBank) on the NCBI site
(http://www.ncbi.nlm.nih.gov/) using the Altschul program (Altschul et al., 1997). The
TERMINATOR program of the Genetics Computer Group was used to identify putative
Rho-independent transcription terminators.
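
For readers unfamiliar with six-frame translation, the following minimal sketch shows the idea. It is not EditSeq and makes several simplifying assumptions: the standard genetic code is used, an "ORF" is taken to be any stop-to-stop stretch, and the 100 b threshold is applied to the nucleotide length of that stretch. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str, offset: int = 0) -> str:
    """Translate from the given offset; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(offset, len(seq) - 2, 3)
    )


def six_frames(seq: str) -> dict:
    """Return the six conceptual translations keyed '+1'..'+3' and '-1'..'-3'."""
    rc = reverse_complement(seq)
    frames = {}
    for k in range(3):
        frames[f"+{k + 1}"] = translate(seq, k)
        frames[f"-{k + 1}"] = translate(rc, k)
    return frames


def long_orfs(seq: str, min_nt: int = 100) -> list:
    """Stop-to-stop stretches longer than min_nt nucleotides, in any frame."""
    hits = []
    for frame, protein in six_frames(seq).items():
        for segment in protein.split("*"):
            if len(segment) * 3 > min_nt:
                hits.append((frame, segment))
    return hits
```

The resulting peptide segments are what would then be submitted to a database search such as PSI-BLAST.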

Procedures 


EXAMPLE 1: Sequencing of the double strand region of 55,839 bp from X.
albilineans containing XALB1 SEQ ID NO. 1

Figure 1 presents a physical map and the genetic organization of XALB1. In the
figure, E and K are restriction endonuclease sites for EcoRI and KpnI, respectively. Rectangular
boxes represent DNA fragments labeled A through N. The numbers below each rectangular box
are the number of Tn5-gusA insertion sites previously located in each DNA fragment (Rott et al.,
1996). The DNA inserts carried by plasmids pALB571 and pALB540 are represented by bold
bars above the physical map. The location and direction of the putative ORFs identified in the
XALB1 gene cluster are shown by arrow shapes. Precise positions and proposed functions for
individual ORFs are summarized in Tables 2 and 3, respectively. These twenty putative ORFs
are potentially organized in four or five operons, as indicated at the bottom of the figure.
Patterns indicate NRPS and PKS genes (diagonal crosshatch), methyl transferase and esterase
genes (hollow rectangles), carbamoyl transferase gene (fine crosshatch), benzoate-derived
products biosynthesis genes (white), regulatory genes (vertical lines), resistance genes
(diagonal lines) and other genes with function of unknown significance to albicidin production
(black). The positions of the insertional sites of eight albicidin-defective mutants, determined
by sequencing, are indicated by vertical arrows. Dotted regions in the physical map and in
ORFs represent the two internal duplicated DNA regions of XALB1.

The sequence illustrated in Figure 1 was generated as follows. The sources of
DNA are set out in Table 1. DNA fragments F, E, B, C, I, and G, generated by the
digestion of cosmid pALB571 (Rott et al., 1996) with EcoRI and/or KpnI, were subcloned
into pBCKS (+) and were sequenced from the resulting subclones, pBC/F, pBC/E, pBC/B,
pBC/C, pBC/I and pBC/G. DNA fragment D', which corresponds to the part of segment
D present in cosmid pALB571, was sequenced from plasmid pUFR043/D' obtained
following self ligation of the complete EcoRI-digested cosmid pALB571. DNA fragment
H was sequenced from pAM45.1 (Rott et al., 1996), obtained following cloning into
vector pBR325 of the 12 kb EcoRI fragment carrying Tn5 and flanking sequences from
mutant strain XaAM45. DNA fragment A' contains the part of fragment A present in
cosmid pALB571 and was subcloned into vector pBCKS (+), and the resulting plasmid
pBC/A' was used for sequencing. The presence of a large internal duplication made
alignment of sequence data obtained from pBC/A' difficult. This difficulty was resolved
using sequence data obtained from an additional plasmid, pAM4, obtained following
cloning into vector pBluescript II KS (+) of the 12 kb EcoRI fragment carrying Tn5 and
flanking sequences from mutant strain XaAM4, which contains only one copy of the large
internal duplication. Sequence data from pBC/A' were used to determine the first 1542 bp
of fragment A' between nucleotides C-19001 and G-20543. Sequence data from pAM4
and pBC/A' were used to determine the last 4823 bp of fragment A' between nucleotides
G-21653 and G-26477. The overlapping region between nucleotides G-20469 and
C-22159 was amplified by PCR from cosmid pALB571 using primers contig13-1160
(5'gcgtaccgttgtccagtagg3') SEQ ID NO. 48 and pAM4-14 (5'gctggaaaccgagaatctga3')
SEQ ID NO. 49, and was sequenced. Resulting sequence data were used to complete
sequencing of DNA fragment A'. The junctions A/F, F/H, H/E, E/B, B/C, C/I, I/G, G/D
between corresponding DNA fragments were sequenced directly from cosmid pALB571.
The EcoRI DNA fragment containing fragments A and F was subcloned from pALB540 into
pBCKS (+), and the resulting plasmid pBC/AF was used to determine the part of DNA
fragment A which was not present in cosmid pALB571 between nucleotides 13682 and
G-19001. EcoRI DNA fragments J, K, L, N were subcloned from pALB540 into pBCKS
(+) and were sequenced from the resulting plasmids pBC/J, pBC/K, pBC/L, and pBC/N. The
junctions L/K, K/J and J/A between corresponding DNA fragments were sequenced
directly from cosmid pALB540. The DNA region between nucleotides G-7517 and T-8721
was amplified by PCR from cosmid pALB540 using primers E114
(5'gacacgatcagccgctagga3') SEQ ID NO. 50 and EI4-380 (5'accagcagttgggccagcct3')
SEQ ID NO. 51 and was sequenced. Resulting sequence data were used to determine the
sequence of fragment M and of junctions N/M and M/L. The nucleotide sequence of
55,839 bp containing the entire major gene cluster involved in albicidin production was
sequenced on both strands.
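
As a rough sanity check on the four PCR primers quoted above (SEQ ID NO. 48 to 51), the sketch below computes their length, GC fraction and a Wallace-rule melting estimate, 2(A+T) + 4(G+C). The Wallace rule is swapped in here purely for illustration; the source does not say how the annealing temperature was chosen, and the function names are hypothetical.

```python
# Primer sequences as quoted in the text, keyed by their SEQ ID numbers.
PRIMERS = {
    "SEQ ID NO. 48": "gcgtaccgttgtccagtagg",
    "SEQ ID NO. 49": "gctggaaaccgagaatctga",
    "SEQ ID NO. 50": "gacacgatcagccgctagga",
    "SEQ ID NO. 51": "accagcagttgggccagcct",
}


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Wallace-rule estimate in degrees C; a crude approximation for
    short oligonucleotides, not the authors' stated method."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```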

EXAMPLE 3: Analysis of the large internal duplications in the DNA sequence
of XALB1

The sequence of the 55,839 bp genomic region (SEQ ID NO. 1) contains two
large internal duplications as shown by the dotted regions in the physical map of Figure
1. A direct duplication of 1736 bp was located in DNA fragment A between nucleotides
G-19904 and G-21639 and between nucleotides G-23057 and G-24792. Another direct
duplication of 2727 bp was found in DNA fragments B and C between nucleotides
C-40410 and G-43136 and between nucleotides C-46644 and G-49370. Comparison of the
two copies of each duplication revealed that the two copies of the 1736 bp duplication are
identical except for one nucleotide at position 21058, and that the two copies of the 2727
bp duplication are 98.8% identical and differ by 30 nucleotides.
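
The coordinates quoted for the two duplications can be cross-checked against the stated lengths with inclusive arithmetic. The sketch below is only that check; the dictionary layout and function names are illustrative assumptions.

```python
# Inclusive coordinates of each copy, as quoted in the text.
DUPLICATIONS = {
    1736: [(19904, 21639), (23057, 24792)],
    2727: [(40410, 43136), (46644, 49370)],
}


def span_length(first: int, last: int) -> int:
    """Number of nucleotides from first to last, both included."""
    return last - first + 1


def copy_offset(copies: list) -> int:
    """Distance between the first nucleotides of the two copies."""
    (first_a, _), (first_b, _) = copies
    return first_b - first_a


if __name__ == "__main__":
    for stated, copies in DUPLICATIONS.items():
        lengths = [span_length(a, b) for a, b in copies]
        print(stated, lengths, copy_offset(copies))
```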
EXAMPLE 4: Comparison of XALB1 with the xabB EcoRI fragment

Comparison of the DNA sequence of the 55,839 bp genomic region described in
this study with the partial DNA sequence of 16,511 bp of the same region in Huang et al.,
2001 (described by Huang et al. as an EcoRI fragment including full length xabB from
X. albilineans strain Xa13 [GenBank accession No. AF239749]), revealed that the DNA
sequence from strain Xa13 over 16,511 bp is identical to the sequence from strain
Xa23R1, described herein, with the following exceptions: 1) five nucleotides are different
at positions 42963, 42972, 42980, 43014 and 43071 of the XALB1 sequence, and 2)
nucleotides from positions 43137 to 49370 are missing (internal to albI; refer to Fig. 1).
Analysis of genomic DNA of seven strains isolated from different countries (Australia,
Reunion Island, Kenya, Zimbabwe and USA), digested by KpnI and hybridized with the
pBC/C plasmid (Table 1) labeled with 32P, revealed that two DNA fragments
corresponding to the XALB1 fragments B and C were present in all strains (data not
shown). This result indicated that all studied strains contain albI and not xabB, because
in albI the pBC/C plasmid probe hybridizes with the large internal duplication present in
both DNA fragments B and C (Figure 1). Based on this observation we postulated that
the DNA sequence of xabB reported as full length by Birch in PCT WO 02/24736 A1
(their Seq. ID #1) appears to be incomplete and missing 6,234 bp of DNA sequence
encoding 2,078 amino acids.
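
The figures of 6,234 bp and 2,078 amino acids follow directly from the positions given for the missing region. A minimal check, with illustrative names that are not taken from the source:

```python
# Region reported as missing from the Xa13 sequence (XALB1 numbering, inclusive).
MISSING_FIRST = 43137
MISSING_LAST = 49370


def missing_nucleotides() -> int:
    """Length of the missing region, counting both end positions."""
    return MISSING_LAST - MISSING_FIRST + 1


def missing_codons() -> int:
    """Whole codons in the missing region; raises if the deletion
    would not preserve the reading frame."""
    length = missing_nucleotides()
    if length % 3 != 0:
        raise ValueError("missing region is not a multiple of three")
    return length // 3


if __name__ == "__main__":
    print(missing_nucleotides(), "bp,", missing_codons(), "codons")
```

Because the length is an exact multiple of three, the absence of this region leaves the downstream reading frame intact, which is consistent with XabB and AlbI sharing an identical C-terminal region.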

EXAMPLE 5: Reading frame analysis in XALB1

Analysis of the 55,839 bp double strand region for coding sequences revealed the
presence of 20 open reading frames (ORFs) designated albI to albXX (Table 2 below),
which are distributed in four groups of genes according to their position and their
orientation in the XALB1 cluster (Figure 1). Genes of each group may form part of the
same operon as judged by their overlapping stop and start codons, or by the relatively
short intergenic region, which varies from 5 to 274 nucleotides. The 20 ORFs appear to
be organized in four operons: operon 1 formed by albI - albIV; operon 2 by albV - albIX;
operon 3 by albX - albXVI; operon 4 by albXVII - albXX. The majority of alb ORFs are
initiated with an ATG codon, except albI and albXVII, which are initiated with a TTG
codon, and albIV and albVI, which are initiated with a GTG start codon. In seven ORFs
of XALB1, start codons are preceded by the consensus sequence GAGG, which may
correspond to the ribosome binding site. Other ORFs are preceded by a less conserved
sequence which contains at least three nucleotides A or G and which may serve as a weak
ribosome binding site.

EXAMPLE 6: Sequencing of the Tn5 insertional site of eight Tox- mutants
previously located in XALB1

Eight of the 45 X. albilineans Tox- mutants complemented by cosmid pALB540
and/or cosmid pALB571 and previously described (Rott et al., 1996) were further
analyzed. All eight mutants contain a single Tn5 insertion and correspond to the following
X. albilineans strains: XaAM7, XaAM15, XaAM45, and XaAM52, which are
complemented by pALB571 but not by pALB540; XaAM4, XaAM29 and XaAM40,
which are complemented by both cosmids; and XaAM1, which is complemented by
pALB540 but not by pALB571. The Tn5 insertional site of each Tox- mutant was
sequenced from plasmids obtained following cloning in pBR325 or in pBluescript II KS
(+) of the EcoRI fragments carrying Tn5 and flanking sequence, using the sequencing
primer GUSN (5'tgcccacaggccgtcgagt3') SEQ ID No. 52 that annealed 135 bp
downstream from the insertional sequence IS50L of Tn5-gusA. The sequence of the Tn5
insertional site was compared with the 55,839 bp sequence containing XALB1 in order
to determine the alb gene disrupted in each Tox- mutant. albI is disrupted by the Tn5
insertion in XaAM15 and XaAM45 at position 33443 and 34229, respectively (Figure 1).
albIV is disrupted by the Tn5 insertion in XaAM7 and XaAM52 at position 53704 and
53915, respectively. albIX is disrupted by the Tn5 insertion in XaAM4, XaAM29 and
XaAM40 at position 21653, 23444 and 24376, respectively. albXI is disrupted by the Tn5
insertion in XaAM1 at position 13301. These results are in accordance with the previous
characterization of Tox- mutants using Southern blot hybridization (Rott et al., 1996),
except for XaAM1. The Tn5-gusA insertion site of XaAM1 was previously located in
DNA fragment A (Rott et al., 1996), but results of this study showed that this site is located
in DNA fragment J (Figure 1).
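
Assigning each insertion to a gene amounts to an interval lookup against the ORF coordinates of Table 2. The sketch below illustrates this for the four disrupted genes; the data structures and function name are hypothetical, and only the coordinates and insertion positions are taken from the text.

```python
# (lowest, highest) coordinate of each ORF, from the start and stop
# positions listed in Table 2 for the four genes named above.
ORFS = {
    "albI": (30166, 50805),
    "albIV": (52382, 55207),
    "albIX": (19003, 24882),
    "albXI": (13217, 14164),
}

# Tn5 insertion positions reported for the eight Tox- mutants.
INSERTIONS = {
    "XaAM15": 33443, "XaAM45": 34229,
    "XaAM7": 53704, "XaAM52": 53915,
    "XaAM4": 21653, "XaAM29": 23444, "XaAM40": 24376,
    "XaAM1": 13301,
}


def disrupted_orf(position: int):
    """Return the ORF whose coordinates contain the position, or None."""
    for name, (low, high) in ORFS.items():
        if low <= position <= high:
            return name
    return None


if __name__ == "__main__":
    for mutant, position in INSERTIONS.items():
        print(mutant, position, disrupted_orf(position))
```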

Table 2: Analysis of putative translational signals and location of all putative orfs
identified in the XALB1 gene cluster

Orf        Potential RBS(a)      Start codon     Stop codon           Intergenic spacing
           (distance from        (position)      (position)           between consecutive
           start codon)                                               orfs in each putative
                                                                      operon

Operon 1 (strand +)
albI       GAGGG (5 b)           TTG (30166)     TAG (50805)          45 b
albII      GAGGG (5 b)           ATG (50851)     TAA (51882)          ATG overlaps TAA
albIII     GAGGG (7 b)           ATG (51882)     TGA (52385)          GTG overlaps TGA
albIV      GAGG (7 b)            GTG (52382)     TAA (55207)

Operon 2 (strand -)
albV       [illegible]           [illegible]     TAA (29210)          [illegible]
albVI      [illegible]           [illegible]     [illegible] (28262)  [illegible]
albVII     [illegible]           [illegible]     [illegible] (25903)  7 b
albVIII    AGGTG (4 b)           ATG (25895)     TAA (24903)          20 b
albIX      GGTG (3 b)            ATG (24882)     TGA (19003)

Operon 3 (strand -)
albX       GGGGG (8 b)           ATG (14497)     TGA (14246)          81 b
albXI      AGGAAA (6 b)          ATG (14164)     TGA (13217)          5 b
albXII     GGCCTGA (5 b)         ATG (13211)     TAA (11856)          36 b
albXIII    GGGG (3 b)            ATG (11819)     TAA (10866)          12 b
albXIV     GGAG (8 b)            ATG (10853)     TAG (9363)           41 b
albXV      GGAA (6 b)            ATG (9321)      TAG (7567)           208 b
albXVI     GGAGG (4 b)           ATG (7358)      TAG (7092)

Operon 4 (strand +)
albXVII    GGGAGG (5 b)          TTG (14909)     TGA (17059)          274 b
albXVIII   GCTCAG (8 b)          ATG (17334)     TGA (17747)          Overlap (17 b)
albXIX     AGG (9 b)             ATG (17728)     TGA (18330)          41 b
albXX      GCAA (8 b)            ATG (18372)     TAG (18980)

(a) Ribosomal Binding Site
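
The start and stop positions in Table 2 can be tied to the protein lengths and intergenic spacings by simple arithmetic. The sketch below assumes, as the numbers themselves suggest, that each start position is the first base of the start codon and each stop position is the last base of the stop codon; the function names are hypothetical.

```python
def protein_length(start: int, stop: int) -> int:
    """Residues encoded from the first base of the start codon to the last
    base of the stop codon (either strand), excluding the stop codon."""
    nucleotides = abs(stop - start) + 1
    if nucleotides % 3 != 0:
        raise ValueError("coordinates do not span a whole number of codons")
    return nucleotides // 3 - 1


def intergenic_gap(upstream_stop: int, downstream_start: int) -> int:
    """Bases strictly between the end of one ORF and the start of the next."""
    return abs(downstream_start - upstream_stop) - 1


if __name__ == "__main__":
    # albI on the + strand and albIX on the - strand.
    print(protein_length(30166, 50805), protein_length(24882, 19003))
    # Spacing albI/albII (+ strand) and albX/albXI (- strand).
    print(intergenic_gap(50805, 50851), intergenic_gap(14246, 14164))
```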



EXAMPLE 7: Homology analysis of proteins potentially encoded by XALB1

Preliminary functional assignments of individual ORFs were made by
comparison of the deduced gene products with proteins of known functions in the
GenBank database. The results are set out in Table 3 below. Among the ORFs
identified from the sequenced XALB1 gene cluster, we found (i) four genes, albI SEQ
ID No. 20, albIV SEQ ID No. 23, albVII SEQ ID No. 17 and albIX SEQ ID No. 15,
encoding PKS and/or NRPS modules; (ii) one carbamoyl transferase gene, albXV SEQ
ID No. 5; (iii) two esterase genes, albXI SEQ ID No. 9 and albXIII SEQ ID No. 7;
(iv) two methyltransferase genes, albII SEQ ID No. 21 and albVI SEQ ID No. 18; (v)
two benzoate-derived products biosynthesis genes, albXVII SEQ ID No. 11 and albXX
SEQ ID No. 14; (vi) two putative albicidin biosynthesis regulatory genes, albIII SEQ
ID No. 22 and albVIII SEQ ID No. 16; (vii) two putative albicidin resistance genes,
albXIV SEQ ID No. 6 and albXIX SEQ ID No. 13; and (viii) two additional ORFs
encoding proteins similar to transposition proteins, albV SEQ ID No. 19 and albXVI
SEQ ID No. 4. No known function was found in the database for albX SEQ ID No.
10 and albXII SEQ ID No. 8. The potential product of albXVIII SEQ ID No. 12
appeared to be a truncation of an enzyme with strong similarity to
4-amino-4-deoxychorismate lyase and branched-chain amino acid aminotransferase. Since the
gene encoding the predicted product is roughly half the length of other such lyase or
aminotransferase genes, albXVIII may be the result of a recombination event and may
be non functional.



Table 3: Deduced functions of the ORFs in the major albicidin biosynthetic cluster
XALB1

Orf        Number of      Sequence homolog(a)     Proposed function(b)
           amino acids

Operon 1
albI       6879           XabB (AAK15074)         Polyketide-peptide synthase
                                                  PKS modules     PKS domains
                                                  PKS-1           AL ACP1
                                                  PKS-2           KS1 KR ACP2 ACP3
                                                  PKS-3           KS2 PCP1
                                                  NRPS modules    NRPS domains
                                                  NRPS-1          C A PCP2
                                                  NRPS-2          C A PCP3
                                                  NRPS-3          C A PCP4
                                                  NRPS-4          C
albII      343            XabC (AAK15075)         C-methyltransferase
albIII     167            ComAB (CAA71583)        Activator of alb genes transcription
albIV      941            MycA (T44806)           Peptide synthase
                          WbpG (E83253)           NRPS module     NRPS domains
                                                  NRPS-5          A PCP5

Operon 2
albV       239            Thp (AAK15074)          No function (transposition)
albVI      286            TcmP (AAA67510)         O-methyltransferase
albVII     765            HbaA (A58538)           4-hydroxybenzoate CoA ligase
albVIII    330            SyrP (AAB63253)         Regulation
albIX      1959           DhbF (CAB04779)         Peptide synthase
                                                  NRPS modules    NRPS domains
                                                  NRPS-6          A PCP6
                                                  NRPS-7          C A PCP7

Operon 3
albX       [illegible]    [illegible]             Unknown
albXI      [illegible]    [illegible]             [illegible]
albXII     [illegible]    [illegible]             Unknown
albXIII    317            hp** (AAK25001)         Esterase
albXIV     496            ActII-2 (P46105)        Albicidin transporter
albXV      584            hp** (08390)            Carbamoyl transferase
albXVI     88             OrfA (AAC03166)         No function (transposition)

Operon 4
albXVII    716            PabAB (CAC22117)        Para-amino benzoate synthase

Operon 5
albXVIII   137            ADCL (AAG06352)         No function (not functional)
albXIX     200            McbG (P05530)           Immunity against albicidin
albXX      202            UbiC (S25660)           4-hydroxybenzoate synthetase

(a) Protein accession numbers in GenBank are given in parentheses.
(b) NRPS and PKS domains are abbreviated as follows: A, adenylation; ACP, acyl carrier
protein; AL, acyl-CoA ligase; C, condensation; KR, ketoreductase; KS, ketoacyl synthase;
PCP, peptidyl carrier protein. Underlined domains are likely inactive due to the lack of
highly conserved motifs.
** hypothetical protein



EXAMPLE 8: The alb PKS and/or NRPS genes

The potential product of albI, designated AlbI SEQ ID No. 20, is a protein of 6879 aa
with a predicted size of 755.9 kDa. This protein is very similar to the potential product of the
xabB gene from X. albilineans strain Xa13 from Australia (Huang et al., 2001), but it differs
in length and size (see Table 4 below). XabB is a protein of 4801 aa with a predicted size of
525.7 kDa. Comparison of AlbI with XabB revealed that the N-terminal regions from Met-1
to Ile-4325 of both proteins are identical except for five amino acids, which are Tyr-3941,
Pro-3952, Ala-4054, Ala-4271 and Gln-4284 in AlbI and His-3941, Ala-3952, Val-4054,
Val-4271 and Glu-4284 in XabB. The same comparison revealed that the AlbI C-terminal region
from Arg-6404 to the stop codon is 100% identical to the XabB C-terminal region from
Arg-4326 to the stop codon.
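
The relationship between XabB and AlbI residue numbering described above can be written as a small coordinate mapping. This is a sketch under the stated boundaries only (shared N-terminal region up to residue 4325, C-terminal regions starting at Arg-4326 in XabB and Arg-6404 in AlbI); the function and constant names are hypothetical.

```python
ALBI_LENGTH = 6879
XABB_LENGTH = 4801
LAST_SHARED_N_TERMINAL = 4325   # Ile-4325 in both proteins
XABB_C_TERMINAL_START = 4326    # Arg-4326 in XabB
ALBI_C_TERMINAL_START = 6404    # Arg-6404 in AlbI


def xabb_to_albi(residue: int) -> int:
    """Map a XabB residue number to the corresponding AlbI residue number."""
    if not 1 <= residue <= XABB_LENGTH:
        raise ValueError("residue outside XabB")
    if residue <= LAST_SHARED_N_TERMINAL:
        return residue
    return residue + (ALBI_C_TERMINAL_START - XABB_C_TERMINAL_START)


if __name__ == "__main__":
    print(xabb_to_albi(4326), xabb_to_albi(XABB_LENGTH))
```

The shift applied to the C-terminal region equals the difference in length between the two proteins, which matches the number of amino acids reported as missing from XabB.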
The N-terminal region (from Met-1 to Asp-3235) of AlbI is 100% identical to the
corresponding region in XabB, which was previously described as similar to many microbial
modular PKS (Huang et al., 2001). This PKS region may be divided into three modules
(Figure 2). Abbreviations used in the figure are: A, adenylation; ACP, acyl carrier protein;
AL, acyl-CoA ligase; C, condensation; KR, β-ketoacyl reductase; KS, β-ketoacyl synthase;
NRPS, nonribosomal peptide synthase; PCP, peptidyl carrier protein; PKS, polyketide
synthase; TE, thioesterase; HBCL, 4-hydroxybenzoate-CoA ligase. The question mark in the
NRPS-2 domain indicates that this A domain is incomplete. The first module, designated
PKS-1, contains acyl-CoA ligase (AL) and acyl carrier protein (ACP1) domains. The second
module, designated PKS-2, contains β-ketoacyl synthase (KS1) and β-ketoacyl reductase (KR)
domains followed by two consecutive ACP domains (ACP2 and ACP3). The third module,
designated PKS-3, contains a KS domain (KS2) followed by a PCP domain (PCP1). Apart
from their very high similarity with XabB, these three PKS modules exhibited the highest
degree of overall similarity with polyketide synthases SafB and PksM from Myxococcus
xanthus and Bacillus subtilis, respectively (Table 4). The motifs characteristic of these
domains are 100% identical to those of XabB, which were previously aligned with those from
other organisms (Huang et al., 2001). The AL domain contains the conserved adenylation core
sequence (SGSSG) and the ATPase motif (TGD). The three ACP domains contain a
4'-phosphopantetheinyl-binding cofactor box GxDS(I/L), except that A replaced G in ACP1.
Both KS domains contain the motif GPxxxxxxxCSxSL around the active site Cys, and two His
residues downstream from the active site Cys, in motifs characteristic of these enzymes. The
KR domain contains the NAD(P)H-binding site GGxGxLG.

The PKS part of AlbI is linked by the PCP1 domain to the four apparent nonribosomal
peptide synthase modules designated NRPS-1, NRPS-2, NRPS-3 and NRPS-4 (Figure 2).
NRPS-1, NRPS-2 and NRPS-3 modules display the ordered condensation (C), adenylation (A)
and PCP domains typical of such enzymes (Marahiel et al., 1997), and NRPS-4 consists of an
extra C domain which may correspond to an incomplete NRPS module. Known conserved
sequences, characteristic of the domains commonly found in peptide synthases (Marahiel et
al., 1997), were compared to those from NRPS-1, NRPS-2, NRPS-3 and NRPS-4 (Tables 5, 6
and 7). Sequences characteristic of C, A, or PCP domains are conserved in these four NRPS
modules, except in the A domain of the NRPS-2 module, suggesting that this latter A domain
may not be functional. Comparison of the four NRPS modules among themselves revealed that
NRPS-2, NRPS-3 and NRPS-4 modules were 30.7%, 94.4% and 47.5% similar to the NRPS-1
module, respectively. Comparison with XabB revealed that NRPS-2 and NRPS-3 modules were
not present in XabB, which contains only NRPS-1 and NRPS-4 modules (Figure 2). The dotted
box in Figure 2 corresponds to the apparent deletion of the NRPS-2 and NRPS-3 modules in
XabB as compared to AlbI. Apart from their very high similarity with XabB, AlbI NRPS modules
exhibited the highest degree of overall similarity with non-ribosomal peptide synthases NosA
and NosC from Nostoc sp.

albIV potentially encodes a protein of 941 aa (AlbIV) with a predicted size of 104.8
kDa. AlbIV is similar to several non-ribosomal peptide synthases such as the BA3 peptide
synthase involved in bacitracin biosynthesis in Bacillus licheniformis (Table 4). AlbIV forms
one NRPS module designated NRPS-5 that contains only an A domain and a PCP domain
(Figure 2). Sequences characteristic of the A and PCP domains commonly found in peptide
synthases (Marahiel et al., 1997) are conserved in AlbIV (Tables 6 and 7). However, the A
domain present in AlbIV differs from A domains commonly found in peptide synthases:
conserved sequences corresponding to cores A8 and A9 in AlbIV are separated by a very long
peptide sequence of 390 amino acids. This additional peptide sequence exhibits significant
similarity with the protein WbpG of 377 amino acids involved in the biosynthesis of a
lipopolysaccharide in Pseudomonas aeruginosa (Table 4).

albVII potentially encodes a protein of 765 aa (AlbVII) with a predicted size of 83.0
kDa similar to the 4-hydroxybenzoate-CoA ligase from several bacteria; the closest
protein (HbaA) was from Rhodopseudomonas palustris (Table 4). High similarity between
AlbVII and HbaA suggests that AlbVII is a 4-hydroxybenzoate-CoA ligase and constitutes a
fourth PKS module designated PKS-4. HbaA is smaller (539 aa), and the similarity
between the two proteins starts only at residue 277 of AlbVII and at residue 28 of
HbaA. Comparison of the AlbVII sequence located upstream from residue 277 produced no
significant alignment. AlbVII, like 4-hydroxybenzoate-CoA ligases, contains some conserved
sequences characteristic of the A domain commonly found in peptide synthases (Table 6).

albIX potentially encodes a protein of 1959 aa (AlbIX) with a predicted size of 218.4
kDa similar to non-ribosomal peptide synthases. Known conserved sequences, characteristic
of the domains commonly found in peptide synthases (Marahiel et al., 1997), were compared
with those from AlbIX, which forms two NRPS modules designated NRPS-6 and NRPS-7
(Tables 5, 6 and 7). NRPS-6 contains only one A and one PCP domain. NRPS-7 contains the
three domains characteristic of NRPS modules (A-C-PCP) followed by a TE domain (Figure
2). Apart from their very high similarity with XabB, NRPS-6 and NRPS-7 modules exhibited the
highest degree of overall similarity and identity with non-ribosomal peptide synthases DhbF
from B. subtilis and NosA from Nostoc sp. (Table 4).

[Table 4: the body of this table was printed sideways and is not recoverable from this copy.
Legible fragments show a column headed "Putative Alb protein" with entries for AlbI (6879 aa)
and its modules PKS-1, PKS-3, NRPS-1 and NRPS-2, and name the homologs XabB (4801 aa),
PksM (4273 aa), NosA (4379 aa), NosC (3317 aa), a peptide synthase (5060 aa) and DhbF
(1278 aa), with the accession numbers AAK15074 and AAF17280. The alignment scores and
identity percentages cannot be assigned to individual rows.]




Table 5: Comparison of conserved sequences in C domains of peptide synthetases and
in putative C domains of the Alb modules

Core   Sequences conserved in        Sequence       Alb module
       peptide synthetases*

C1     SxAQxR(L/M)(W/Y)xL            TYAQERLWLV     NRPS-1
                                     STAQERMWFL     NRPS-2
                                     SYAQERLWLV     NRPS-3
                                     SLFQERLWFV     NRPS-4
                                     SYQQERLWFV     NRPS-7

C2     RHExLRTxF                     RHEVLRTRP      NRPS-1 and NRPS-3
                                     RHAVLRTHP      NRPS-2
                                     RHEILRTRF      NRPS-4
                                     RHETLRTRI      NRPS-7

C3     MHHxISDG(W/V)S                IHHIISDGWS     NRPS-1 and NRPS-3
                                     IHHIVFDGWS     NRPS-2
                                     MHHLIYDAWS     NRPS-4
                                     MHHIICDGWS     NRPS-7

C4     YxD(F/Y)AVW                   YADYALW        NRPS-1 and NRPS-3
                                     YADYARW        NRPS-2
                                     YADYAIW        NRPS-4
                                     YADYATW        NRPS-7

C5     (I/V)GxFVNT(Q/L)(C/A)xR       IGFFINILPLR    NRPS-1, NRPS-3 and NRPS-4
                                     IGLFVNTLAVR    NRPS-2
                                     IGFFVNirAVR    NRPS-7

C6     (H/N)QD(Y/V)PFE               HQSVPPE        NRPS-1 and NRPS-3
                                     HQDVPFE        NRPS-2
                                     NQALPFE        NRPS-4
                                     HRALPFE        NRPS-7

C7     RDxSRNPL                      RDSSQIPL       NRPS-1 and NRPS-3
                                     RDTARNPL       NRPS-2
                                     RDTSRIPL       NRPS-4

*Sourced from Marahiel et al., 1997
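
One way to read Table 5 is to count, for each Alb sequence, how many constrained positions of the consensus it satisfies. The sketch below does this for consensus strings written in the table's notation, where "x" is any residue and "(W/V)" lists alternatives. It is an illustration with hypothetical function names, not the comparison procedure used by the authors.

```python
import re

# Either a parenthesised list of alternatives such as (W/V), or a single letter.
TOKEN = re.compile(r"\(([A-Z/]+)\)|([A-Za-z])")


def parse_consensus(consensus: str) -> list:
    """Turn a consensus string into one entry per position:
    None for 'x' (unconstrained), otherwise the set of allowed residues."""
    allowed = []
    for group, single in TOKEN.findall(consensus):
        if group:
            allowed.append(set(group.replace("/", "")))
        elif single == "x":
            allowed.append(None)
        else:
            allowed.append({single})
    return allowed


def conserved_positions(consensus: str, sequence: str) -> tuple:
    """Return (matching constrained positions, total constrained positions)."""
    allowed = parse_consensus(consensus)
    if len(allowed) != len(sequence):
        raise ValueError("sequence and consensus differ in length")
    constrained = [(a, r) for a, r in zip(allowed, sequence) if a is not None]
    matched = sum(1 for a, r in constrained if r in a)
    return matched, len(constrained)


if __name__ == "__main__":
    # Core C3 against the sequence listed for NRPS-1 and NRPS-3.
    print(conserved_positions("MHHxISDG(W/V)S", "IHHIISDGWS"))
```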



Table 6: Comparison of conserved sequences in A domains of peptide synthetases and in
putative A domains of the Alb modules

Core   Sequences conserved in        Sequence                  Alb module
       peptide synthetases*

A1     L(T/S)YxEL                    WSYAQL                    NRPS-1 and NRPS-3
                                     LSYAQL                    NRPS-2
                                     MSYGQL                    NRPS-5
                                     FSYRQL                    PKS-4
                                     LSYAQL                    NRPS-6 and NRPS-7

A2     LKAGxAYL(V/L)P(L/I)D          FKAGACYVPID               NRPS-1 and NRPS-3
                                     SLCGAASVLID               NRPS-2
                                     MKAGAAYVPID               NRPS-5
                                     LAGGLVFAPIN               PKS-4
                                     LKAGGCYVPLD               NRPS-6 and NRPS-7

A3     LAYxxYTSG(S/T)TGxPKG          LACVMVTSGSTGRPKG          NRPS-1 and NRPS-3
                                     ?TRTiryiVESGSLSSRLL?      NRPS-2
                                     PVYCIYTSGSTGSPKG          NRPS-5
                                     PAVMICTSGSTGTPKA          PKS-4
                                     LAYVMYTSGSTGRPKG          NRPS-6 and NRPS-7

A4     FDxS                          FAVS                      NRPS-1 and NRPS-3
                                     FDAA                      NRPS-2
                                     FDLT                      NRPS-5
                                     FAYG                      PKS-4
                                     PAIS                      NRPS-6 and NRPS-7

A5     NxYGPTE                       NNYGCTE                   NRPS-1 and NRPS-3
                                     ?AAYGNAE?                 NRPS-2
                                     NEYGPTE                   NRPS-5
                                     YIYGCTE                   NRPS-6 and NRPS-7

A6     GELxXxGxG(V/L)ARGYL           GELHVHSVGMARGYW           NRPS-1 and NRPS-3
                                     np                        NRPS-2
                                     GQIHIGGAGVAIGYV           NRPS-5
                                     GSLWVRGNTLTRGYV           PKS-4
                                     GEVHIESLGITHGYW           NRPS-6 and NRPS-7

A7     Y(R/K)TGDL                    YKTGDM                    NRPS-1 and NRPS-3
                                     ?YRTDAL?                  NRPS-2
                                     YASGDL                    NRPS-5
                                     ?FDTRDL?                  PKS-4
                                     YRTGDM                    NRPS-6 and NRPS-7

A8     GRxDxQVKIRGxRIELGEIE          GRQDFEVKVRGHRVDTRQVE      NRPS-1 and NRPS-3
                                     ?GSLDVQSRIDDPRIDLCWE?     NRPS-2
                                     GRKDSQIKLRGYRIELGEIE      NRPS-5
                                     ?GRMGSAIKINGCWLSPETLE?    PKS-4
                                     GRRDYEVKVRGYRVDVRQVE      NRPS-6 and NRPS-7

A9     LPxYM(I/V)P                   LPTYMLP                   NRPS-1 and NRPS-3
                                     ?LPDYLLP?                 NRPS-2
                                     LPEYMLP                   NRPS-5
                                     ?LGKHHYP?                 PKS-4
                                     LPTYMLP                   NRPS-6 and NRPS-7

A10    NGK(V/L)DR                    NGKLDR                    NRPS-1 and NRPS-3
                                     ?HGRVDL?                  NRPS-2
                                     NGKVNR                    NRPS-5
                                     ?SGKVIR?                  PKS-4
                                     NGKLDT                    NRPS-6 and NRPS-7

*Sourced from Marahiel et al., 1997
?: non conserved sequences
np: not present
Table 7: Comparison of conserved sequences in PCP and TE domains of peptide
synthetases and in putative PCP and TE domains of the Alb modules

Domain   Sequences conserved in      Sequence        Alb module (domain)
         peptide synthetases*

PCP      DxFFxxLGG(H/D)S(L/I)        D-FFAVGGHSVL    PKS-3 (PCP1)
                                     DNFFALGGHSLS    NRPS-1 and NRPS-3 (PCP2 and PCP4)
                                     DNFFELGGHSVL    NRPS-2 (PCP3)
                                     DNFFELGGHSLS    NRPS-5 (PCP5)
                                     DNFFNLGGHSLL    NRPS-6 and NRPS-7 (PCP6 and PCP7)

TE       G(H/Y)SxG                   GWSSG           NRPS-7

*Sourced from Marahiel et al., 1997



EXAMPLE 9: The alb carbamoyl transferase gene 

albXV potentially encodes a protein of 584 aa with a predicted size of 65.2 kDa. This 
protein, AlbXV, is similar to BlmD, a carbamoyl transferase involved in bleomycin 
biosynthesis in Streptomyces verticillus (Du et al., 2000), and to a probable carbamoyl 
transferase potentially expressed in P. aeruginosa (Table 4). The high similarity of AlbXV 
with these proteins suggests that AlbXV is a carbamoyl transferase. 
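
The predicted sizes quoted in these examples can be sanity-checked against the residue counts. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not stated in the source; the constant and function names are illustrative only, and the result is a rough estimate rather than the method used to obtain the reported values.

```python
# Rough protein mass estimate from residue count.
# ASSUMPTION: ~110 Da per residue is a generic rule of thumb, not a value from the source.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(residue_count: int) -> float:
    """Return an approximate molecular mass in kDa for a protein of the given length."""
    return residue_count * AVERAGE_RESIDUE_MASS_DA / 1000.0


# AlbXV: 584 aa, predicted size reported in the text as 65.2 kDa.
albxv_estimate = estimate_mass_kda(584)
print(f"AlbXV rough estimate: {albxv_estimate:.1f} kDa (text reports 65.2 kDa)")
```

The estimate falls within a few percent of the reported value, which is as close as a composition-independent rule of thumb can be expected to get.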

EXAMPLE 10: The alb esterase genes 

albXI potentially encodes a protein of 315 aa with a predicted size of 35.9 kDa. This 
protein, AlbXI, exhibits low similarity to SyrC, a putative thioesterase involved in 
syringomycin biosynthesis by Pseudomonas syringae (Zhang et al., 1995), and to a potential 
hydrolase encoded by Streptomyces coelicolor (Table 4). The precise function of SyrC remains 
unknown, but SyrC is similar to a number of thioesterases, including fatty acid thioesterases, 
haloperoxidases, and acyltransferases that contain a characteristic GxCxG motif. The 
corresponding SyrC domain GICAG is conserved in AlbXI, which contains the sequence 
GWCQA, except that A replaces the last G, suggesting that AlbXI may be an esterase despite 
its low overall similarity with SyrC. 

albXIII potentially encodes a protein of 317 aa with a predicted size of 34.5 kDa. This 
protein, AlbXIII, is similar to hypothetical proteins with unknown function from several 
bacteria including Caulobacter crescentus (Table 4). AlbXIII and these hypothetical proteins 
contain a GxSxG motif characteristic of serine esterases and thioesterases, the corresponding 
sequence in AlbXIII being GHSVG. In addition, AlbXIII presents a similarity with the 
2-acetyl-1-alkylglycerophosphocholine esterase which hydrolyzes the platelet-activating factor 
in Canis familiaris (Table 4), suggesting that AlbXIII is an esterase. 
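
The motif comparisons in this example can be reproduced with a simple pattern search. The following is a minimal sketch that treats "x" in the motif notation as any residue and checks the three five-residue stretches quoted above; the dictionary and function names are illustrative and are not part of the original analysis.

```python
import re

# 'x' in the motif notation stands for any single residue.
MOTIFS = {
    "GxCxG": re.compile(r"G.C.G"),
    "GxSxG": re.compile(r"G.S.G"),
}

# Five-residue stretches quoted in the text.
SEGMENTS = {
    "SyrC": "GICAG",
    "AlbXI": "GWCQA",
    "AlbXIII": "GHSVG",
}


def matching_motifs(segment: str) -> list[str]:
    """Return the names of all motifs found anywhere in the segment."""
    return [name for name, pattern in MOTIFS.items() if pattern.search(segment)]


for protein, segment in SEGMENTS.items():
    print(protein, segment, matching_motifs(segment))
```

A strict search of this kind does not report the AlbXI stretch as a GxCxG match, because the final G is replaced by A; this is the deviation noted in the text.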

EXAMPLE 11: The alb methyltransferase genes 

albII potentially encodes a protein of 343 aa (AlbII) with a predicted size of 37.7 kDa. 
albII is 100% identical to the xabC cistron, previously described as encoding an 
O-methyltransferase downstream from xabB (Huang et al., 2000a). This conclusion is based on the 
similarity of XabC with a family of methyltransferases that utilize S-adenosyl-L-methionine 
(SAM) as a co-substrate for O-methylation, including the TcmO protein from Streptomyces 
glaucescens (Huang et al., 2000a). AlbII contains three highly conserved motifs of 
SAM-dependent methyltransferases, including motif I involved in SAM binding (Figure 3). In 
the Figure, identical or similar amino acids (A=G; D=E; I=L=V) are shown in bold. 
Numbers indicate the position of the amino acid from the N-terminus of the protein. 
Abbreviations used in the Figure are: Sgl-TcmO and Sgl-TcmN, multifunctional 
cyclase-hydratase-3-O-Mtase and tetracenomycin polyketide synthesis 8-O-Mtase of Streptomyces 
glaucescens, respectively (accession number: M80674); Smy-MdmC, midecamycin-O-Mtase 
of Streptomyces mycarofaciens (accession number: M93958); Mxa-SafC, saframycin 
O-Mtase of Myxococcus xanthus (accession number: U24657); Ser-EryG, erythromycin 
biosynthesis O-Mtase of Saccharopolyspora erythraea (accession number: S18533); 
Spe-DauK, carminomycin 4-O-Mtase of Streptomyces peucetius (accession number: L13453); 
Sal-DmpM, O-demethylpuromycin-O-Mtase of Streptomyces alboniger (accession number: 
M74560); Shy-RapM, rapamycin O-Mtase of Streptomyces hygroscopicus (accession number: 
X86780); Sav-AveD, avermectin B 5-O-Mtase of Streptomyces avermitilis (accession 
number: G5921167); Sar-Cmet, mithramycin C-methyltransferase of Streptomyces 
argillaceus (accession number: AF077869); AlbII, putative albicidin biosynthesis 
C-methyltransferase of Xanthomonas albilineans (SEQ ID No. 27; identical to XabC, 
accession number: AF239749). 

Comparison of AlbII with the Genbank database revealed that AlbII, besides 100% 
identity to XabC, exhibited the highest degree of overall identity with MtmMII, a 
C-methyltransferase from Streptomyces argillaceus (Table 4) involved in C-methylation of the 
polyketide chain for mithramycin biosynthesis, suggesting that AlbII is a C-methyltransferase. 
XabC was not compared by Birch and co-workers with MtmMII (Huang et al., 2000a) 
because the MtmMII sequence was not available until recently in the Genbank database. The 
three highly conserved motifs in SAM methyltransferases are also present in MtmMII 
(Figure 3), suggesting that AlbII is a SAM-dependent C-methyltransferase. 

albVI potentially encodes a protein of 286 aa (AlbVI) with a predicted size of 32.1 
kDa similar to several hypothetical proteins from Mycobacterium tuberculosis (Genbank 
accession Nos. AAK46042, AAK48238, AAK44517, AAK46218) and from S. coelicolor 
(Genbank accession No. CAC03631). AlbVI is also similar to the tetracenomycin C synthesis 
protein (TcmP) of Pasteurella multocida (Table 4). Four highly conserved motifs in TcmP 
and other O-methyltransferases are also present in AlbVI (Figure 4), suggesting that AlbVI is 
an O-methyltransferase. In the Figure, identical or similar aa (A=G; D=E; I=L=V; K=R) 
are shown in bold. Numbers indicate the position of aa from the N-terminus of the protein. 
Abbreviations used in the Figure are: Sgl-tcmP, tetracenomycin C synthesis protein of 
Streptomyces glaucescens (accession number: C47127); Sme-PKS, putative polyketide 
synthase of Sinorhizobium meliloti (accession number: AAK65734); Pmu-tcmP, 
tetracenomycin C synthesis protein of Pasteurella multocida (accession number: AAK03406); 
Mtu-Omt, putative O-methyltransferase of Mycobacterium tuberculosis (accession number: 
AAK45444); Mlo-Hp, hypothetical protein containing similarity to O-methyltransferase of 
Mesorhizobium loti (accession number: BAB50127); Mtu-Hp1, hypothetical protein of 
Mycobacterium tuberculosis (accession number: AAK46042); Mtu-Hp2, hypothetical protein 
of Mycobacterium tuberculosis (accession number: AAK48238); Mtu-Hp3, hypothetical 
protein of Mycobacterium tuberculosis (accession number: AAK44517); AAK46218); 
Sco-Hp, hypothetical protein of Streptomyces coelicolor (accession number: CAC03631); AlbVI, 
putative albicidin biosynthesis O-methyltransferase of Xanthomonas albilineans (this study). 
The three highly conserved motifs in SAM methyltransferases are not present in AlbVI, 
indicating that SAM is not a co-substrate of AlbVI. 
EXAMPLE 12: The alb derived-benzoate products biosynthesis genes 

albXVII potentially encodes a protein of 716 aa with a predicted size of 79.8 kDa. This 
protein, AlbXVII, is very similar to the para-aminobenzoate (PABA) synthase from 
Streptomyces griseus (Table 4). This enzyme is required for the production of the antibiotic 
candicidin (Criado et al., 1993). 

albXVIII potentially encodes a protein of 137 aa with a predicted size of 15.0 kDa. 
This protein, AlbXVIII, is similar to the 4-amino-4-deoxychorismate lyase (ADCL) from P. 
aeruginosa (Table 4). The function of ADCL is to convert 4-amino-4-deoxychorismate into 
PABA and pyruvate. AlbXVIII is shorter than ADCL (Table 4), 
and the similarity of AlbXVIII with this protein starts only at residue 161. albXVIII is 
preceded by a small ORF encoding a sequence of 59 aa similar to the first 42 amino acids of 
ADCL from P. aeruginosa. These data suggest that albXVIII is probably a truncated form of 
albXVni and probably not functional. albXVIII may, therefore, not be involved in albicidin 
biosynthesis. The region between albXVII and albXVIII was amplified by PCR from total 
DNA of X. albilineans Xa23R1 strain using primers ORFW (5'gcgagaggacaagctgctgc3') SEQ 
ID No. 53 and ORFY (5'cgttgaggatgcagcgctcg3') SEQ ID No. 54 and was sequenced. 
The resulting sequence data showed that the sequence of the PCR fragment was 100% identical to 
the sequence of pALB540, indicating that recombination of albXVIII did not occur during 
cloning of the genomic fragment in pALB540. 
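
As a small worked illustration, basic properties of the two primers quoted above (length, GC fraction and reverse complement) can be computed directly from their sequences. This is a sketch only; the helper names are hypothetical and no claim is made about how the primers were originally designed.

```python
# Primer sequences as quoted in the text (5' to 3').
PRIMERS = {
    "ORFW": "gcgagaggacaagctgctgc",  # SEQ ID No. 53
    "ORFY": "cgttgaggatgcagcgctcg",  # SEQ ID No. 54
}

# Translation table for complementing lower-case DNA bases.
COMPLEMENT = str.maketrans("acgt", "tgca")


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    return sum(base in "gc" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a lower-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(name, len(seq), f"{gc_fraction(seq):.2f}", reverse_complement(seq))
```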

albXX potentially encodes a protein of 202 aa with a predicted size of 22.6 kDa. This 
protein, AlbXX, is similar to the 4-hydroxybenzoate synthase potentially involved in 
ubiquinone biosynthesis by Escherichia coli (Siebert et al., 1992). 

EXAMPLE 13: The alb regulatory genes 

albIII potentially encodes a protein of 167 amino acids with a predicted size of 17.8 
kDa that is similar to the transcription factors ComA of different bacteria such as E. coli and 
B. licheniformis (Table 4). ComA transcription factors appear to be involved in regulation of 
antibiotic production in bacteria. In E. coli, a gene similar to comA is present in the 
enterobactin biosynthesis gene cluster (Liu et al., 1989). In B. subtilis, ComAB was described 
as a probable positive activator of lichenysin synthetase transcription (Yakimov et al., 1998) 
and a gene similar to comA was shown to be essential for bacilysin biosynthesis (Yazgan et 
al., 2001). These data suggest that AlbIII regulates transcription of genes involved in 
albicidin biosynthesis. 

albVIII potentially encodes a protein of 330 aa with a predicted size of 37.7 kDa. This 
protein, AlbVIII, is very similar to the SyrP-like protein from S. verticillus and to the SyrP protein 
from P. syringae (Table 4). SyrP participates in a phosphorylation cascade controlling 
syringomycin synthesis (Zhang et al., 1997) and the syrP-like gene was described in the S. 
verticillus bleomycin biosynthetic gene cluster (Du et al., 2000). These data suggest that 
AlbVIII regulates albicidin biosynthesis in X. albilineans. 

EXAMPLE 14: The alb resistance genes 

albXIV potentially encodes a protein of 496 aa with a predicted size of 52.7 kDa. This 
protein, AlbXIV, is 100% identical to AlbF isolated from X. albilineans strain Xa13 
(GenBank Accession AF403709; direct submission by Bostock and Birch and described as "a 
putative albicidin efflux pump which confers resistance to albicidin in E. coli"). AlbXIV and 
AlbF are closely related to a family of transmembrane transporters involved in antibiotic 
export and antibiotic resistance in many antibiotic-producing organisms. AlbXIV and AlbF 
exhibited the highest degree of overall identity with the putative transmembrane efflux protein 
from S. coelicolor (Table 4). These data suggest that AlbXIV and AlbF may be involved in 
albicidin resistance by transporting the toxin out of the bacterial cells that produce it. 
Alternatively, AlbXIV and AlbF may simply play a role in antibiotic secretion and/or plant 
pathogenesis to effect the transport of albicidin outside of producing cells. 

albXIX potentially encodes a protein of 200 aa with a predicted size of 22.8 kDa. This 
protein, AlbXIX, is similar to the McbG protein from E. coli (Table 4). In Enterobacteriaceae, 
the McbG protein, together with two other proteins (McbE and McbF), was shown to cause 
immunity to the peptide antibiotic microcin B17 which inhibits DNA replication by induction 
of the SOS repair system (Garrido et al., 1988). McbE and McbF proteins serve as a pump 
for the export of the active antibiotic from the cytoplasm, whereas McbG alone also 
provides some protection: a well-characterized deficient-immunity phenotype is exhibited by 
microcin B17-producing cells in the absence of the immunity gene mcbG (Garrido et al., 
1988). The significant similarity between AlbXIX and McbG, together with the fact that 
albicidin also blocks DNA replication (Birch and Patil, 1985a), suggests that AlbXIX confers 
immunity against albicidin in X. albilineans. 

EXAMPLE 15: Transposition proteins 

albV is 100% identical to the thp gene described in a divergent position upstream from 
xabB (Huang et al., 2000a). The thp gene potentially encodes a protein of 239 aa displaying 
significant similarity to the IS21-like transposition helper proteins. In X. albilineans strain 
LS155 from Australia, insertional mutagenesis of thp blocked albicidin production, but 
trans-complementation failed, indicating the involvement in albicidin production of a downstream 
gene in the thp operon (Huang et al., 2000a). 

albXVI potentially encodes a protein of 88 aa with a predicted size of 9.8 kDa similar 
to the transposases from several bacteria such as Xanthomonas axonopodis or Desulfovibrio 
vulgaris (Table 4). 

The presence of transposition proteins in the XALB1 cluster is probably a remnant 
from a past transposition event that may have contributed to the development of the albicidin 
XALB1 cluster. 



EXAMPLE 16: Unknown functions 

albX potentially encodes a protein of 83 aa with a predicted size of 9.4 kDa. This 
protein, AlbX, is similar to a hypothetical protein from P. aeruginosa and to the MbtH 
protein from Mycobacterium tuberculosis. MbtH is a protein with unknown function found in 
the mycobactin gene cluster (Quadri et al., 1998). A MbtH-like protein with unknown 
function was also described in the bleomycin biosynthetic gene cluster of S. verticillus (Du et 
al., 2000). These data suggest that AlbX is involved in albicidin biosynthesis but its function 
remains unknown. 

albXII potentially encodes a protein of 451 aa with a predicted size of 51.6 kDa. This 
protein, AlbXII, is very similar to a protein of 55 kDa encoded by the boxB gene in Azoarcus 
evansii (Table 4). This protein is a component of a multicomponent enzyme system involved 
in the hydroxylation of benzoyl CoA, a step of aerobic benzoate metabolism in Azoarcus 
evansii, but its function remains unknown (Mohamed et al., 2001). 

EXAMPLE 17: Prediction of amino acid specificity of Alb NRPS modules 

In NRPSs, specificity is mainly controlled by A domains which select and load a 
particular amino-, hydroxy- or carboxy-acid unit (Marahiel et al., 1997). The 
substrate-binding pocket of the phenylalanine adenylation (A) domain of the gramicidin S synthetase 
(GrsA) from Brevibacillus brevis was recently identified by crystal structure analysis as a 
stretch of about 100 amino acid residues between highly conserved motifs A4 and A5 (Conti 
et al., 1997). Based on sequence analysis of known A domains, in relation to the crystal 
structure of the GrsA (Phe) substrate binding pocket, similar models have been published to 
predict the amino acid substrate which is recognized by an unknown NRPS A domain (Challis 
et al., 2000; Stachelhaus et al., 1999). These models postulate specificity-conferring codes for 
A domains of NRPS consisting of critical amino acid residues putatively involved in substrate 
specificity. The model proposed by Marahiel and co-workers (Stachelhaus et al., 1999) 
defined a signature sequence consisting of ten amino acids aligning with the ten residues of the 
phenylalanine-specific binding pocket located at positions 235, 236, 239, 278, 299, 301, 322, 
330, 331 and 517 in the GrsA (Phe) sequence (accession number: P14687). The model 
proposed by Townsend and co-workers (Challis et al., 2000) uses only the first eight of these 
critical residues. 
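
The relationship between the two models can be sketched as follows. The ten GrsA positions are taken from the text; everything else (the function name, the use of a position-to-residue mapping, and the "-" placeholder for positions with no aligned residue) is an assumption made for illustration and does not reproduce the alignment procedure used in the cited studies.

```python
# GrsA (Phe) numbering of the ten signature positions (Stachelhaus et al., 1999).
SIGNATURE_POSITIONS = (235, 236, 239, 278, 299, 301, 322, 330, 331, 517)

# The Challis et al. (2000) model uses only the first eight of these positions.
CHALLIS_POSITIONS = SIGNATURE_POSITIONS[:8]


def extract_code(residue_at: dict[int, str], positions=SIGNATURE_POSITIONS) -> str:
    """Build a specificity code from a mapping of GrsA position -> aligned residue.

    Positions missing from the mapping are written as '-' (hypothetical convention).
    """
    return "".join(residue_at.get(pos, "-") for pos in positions)


# Illustrative input only: the two residues the text describes as invariant
# (Asp235 and Lys517); the remaining positions are deliberately left unfilled.
example = {235: "D", 517: "K"}
print(extract_code(example))                     # ten-position code
print(extract_code(example, CHALLIS_POSITIONS))  # eight-position code
```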

Preliminary specificity assignments of albicidin synthase AlbI, AlbIV, AlbVII and 
AlbIX NRPS modules were made by comparison of complete sequences between conserved 
motifs A4 and A5 with sequences in the Genbank database. The corresponding sequence of 
the AlbIV NRPS-5 module is most related to domain 5 of bacitracin synthase 3 (BA3) from B. 
licheniformis that was suggested to activate Asn (Konz et al., 1997). Corresponding 
sequences of AlbI and AlbIX NRPS-1, NRPS-3, NRPS-6 and NRPS-7 modules, apart from 
their very high similarity with XabB, exhibited the highest degree of overall identity (39%) 
with the Blm NRPS2 module of the biosynthetic gene cluster for bleomycin from S. verticillus 
that specifies β-alanine (Du et al., 2000). The corresponding sequence of AlbVII PKS-4 
produced the highest significant alignment with acetate-CoA ligase from Sulfolobus 
solfataricus (Genbank accession number AAK41550), aryl-CoA ligase from Comamonas 
testosteroni (Genbank accession number AAC38458) and 4-hydroxybenzoate-CoA ligase 
from R. palustris. The sequence between motifs A4 and A5 of the AlbI NRPS-2 could not be 
significantly aligned with any sequence present in the Genbank database. Comparison of this 
sequence with the corresponding sequence of GrsA (Phe) revealed that parts of the putative 
core and structural "anchor" sequences of AlbI NRPS-2 are deleted (Figure 5), suggesting that 
the AlbI NRPS-2 substrate binding pocket is not functional. In the Figure, amino acids of the 
six Alb NRPSs and of Alb PKS-4 that are identical or similar to GrsA or Blm sequences 
(A=G; D=E; I=L=V; R=K) are shown in bold. Amino acids underlined in the GrsA sequence 
correspond to the phenylalanine-specific binding pocket. The positions of these amino acids 
in the GrsA primary sequence are indicated at the top of the figure. Amino acids underlined 
in the other sequences correspond to putative constituents of binding pockets, aligned with the 
seven residues of the phenylalanine-specific binding pocket of GrsA. Shaded amino acids 
correspond to the putative core sequences and structural "anchors" based on comparison with 
the GrsA binding-pocket structure. 

Alignment of the primary sequence between conserved motifs A4 and A5 of the AlbI, 
AlbIV, AlbVII and AlbIX NRPS-1, NRPS-3, NRPS-5, NRPS-6, NRPS-7 and PKS-4 modules 
with the corresponding sequence of GrsA (Phe) (Figure 5) revealed the putative constituents 
of binding pockets that constitute the codes as defined by Marahiel and co-workers 
(Stachelhaus et al., 1999). These codes were compared with those of proteins most related to 
the sequence between the A4 and A5 motifs (Table 8) and were analyzed with the model 
proposed by Townsend and co-workers (Challis et al., 2000, 
http://jhunix.hcf.jhu.edu/~ravel/nrps//). Using these codes, we were able to predict the 
asparagine specificity of the AlbIV NRPS-5 module. The AlbIV NRPS-5 signature is 100% 
identical to the BacC-M5 (Asn) and TyrC-M1 (Asn) codes identified in bacitracin synthetase 3 
from B. licheniformis and in tyrocidine synthetase 3 from B. brevis (Table 8). The AlbIV 
NRPS-5 signature is also identical to the Asn code defined by Marahiel and co-workers 
(1997), except that I is replaced by L at position 299 (Table 8). The AlbI and AlbIX NRPS-1, 
3, 6 and 7 signatures did not match any of those defined by Marahiel and co-workers (1997). 
Similarly, convincing predictions using the model proposed by Townsend and co-workers 
were not obtained either (Challis et al., 2000, http://jhunix.hcf.jhu.edu/~ravel/nrps//). The 
AlbI and AlbIX NRPS-1, 3, 6 and 7 signatures diverged from all NRPS signatures previously 
described, except from the XabB signature that is identical to the AlbI NRPS-1 and 3 
signatures. The signature most closely related to AlbI NRPS-1 and 3 specifies Pro and the 
signature most closely related to AlbIX NRPS-6 and 7 specifies Ser, but the degree of 
similarity in both cases is very weak (Table 8). The PKS-4 signature is similar to the AlbI 
NRPS-1 and NRPS-3 signatures at positions 235, 299 and 301. 

Analysis of the alignment of the primary sequence between conserved motifs A4 and A5 
of the AlbI and AlbIX NRPS-1, NRPS-3, NRPS-6 and NRPS-7 modules with the 
corresponding sequences of the bleomycin synthase (Blm) NRPS2 (β-Ala) and gramicidin S 
synthetase (GrsA) modules (Figure 5) revealed that (i) sequences of AlbI NRPS-1 and AlbI 
NRPS-3 differ only at the level of two residues that are not involved in substrate binding, (ii) 
sequences of AlbIX NRPS-6 and AlbIX NRPS-7 are 100% identical, (iii) sequences of AlbI 
NRPS-1 and AlbI NRPS-3 are very similar to sequences of AlbIX NRPS-6 and AlbIX 
NRPS-7 but differ at the level of five putative constituents of the binding pocket, (iv) AlbI and AlbIX 
NRPS residues, which are similar to residues of Blm NRPS2 (β-Ala) or GrsA (Phe), are 
essentially located at the level of the putative core sequences and structural "anchor", and 
differ at the level of putative constituents of the binding pocket. 

Binding-pocket constituents forming the NRPS signatures have been classified into 
three subgroups according to their variability among 160 specificity-conferring signature 
sequences (Stachelhaus et al., 1999): (i) invariant residues Asp235 and Lys517 that mediate 
key interactions with the α-amino and α-carboxylate group of the substrate, respectively; (ii) 
moderately variant residues in positions 236, 301 and 330 which correspond to aliphatic 
amino acids and which may modulate the catalytic activity and fine-tune the specificity of the 
corresponding domains; (iii) highly variant residues in positions 239, 278, 299, 322 and 331 
which may facilitate substrate specificity. AlbI and AlbIX NRPS-1, 3, 6 and 7 signatures are 
not totally in accordance with this classification. Invariant residue Lys517 is conserved in the 
four NRPS signatures, indicating the presence of an α-carboxylate group in the corresponding 
substrates. The Asp235Ala alteration is not consistent with an α-amino acid substrate. Birch 
and co-workers (Huang et al., 2001) assumed that the initial alanine residue in the XabB 
signature was consistent with a nonproteinogenic hydroxy acid substrate by analogy with the 
initial glycine in the signature of the hydroxyisovaleric-acid (HVCL) loading domain of 
enniatin synthetase. The presence of an initial alanine in the AlbVII PKS-4 signature (Figure 
8) and in several 4-hydroxybenzoate-CoA ligase codes may confirm this hypothesis. 
However, the HVCL loading domain of enniatin synthetase (Table 8) and AlbVII PKS-4 are 
not preceded by a C domain and are not followed by a PCP domain, in contrast to the AlbI 
and AlbIX NRPS-1, 3, 6 and 7 modules. An Asp235Val alteration was recently described in 
the β-Ala specificity-conferring code (Du et al., 2000, Table 8), suggesting that the substrate 
of AlbI and AlbIX NRPS-1, 3, 6 and 7 modules may be different from α-amino acids but may 
contain an amino group. Residue 236 is an aliphatic residue (Val or Ile) in all AlbI and AlbIX 
NRPS-1, 3, 6 and 7 signatures. Residue 301 is an aliphatic residue (Ala) in the AlbI NRPS-1 
and 3 codes, but it is a Ser in the AlbIX NRPS-6 and 7 signatures. Residue 330 is not an 
aliphatic residue in the four NRPS signatures but an Asp. Similar alterations are present in the 
β-Ala code: residue 236 is an Asp, residue 301 is a Ser and residue 330 is an aliphatic amino 
acid. Concerning highly variable residues, AlbI NRPS-1 and 3 signatures differ from AlbIX 
NRPS-6 and 7 signatures at residue positions 299, 322 and 331, confirming that both types of 
NRPS modules specify different substrates. 
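
The three variability subgroups described above lend themselves to a simple lookup. The sketch below encodes the positions given in the text and classifies the positions at which the AlbI NRPS-1/3 and AlbIX NRPS-6/7 signatures are reported to differ (301, 299, 322 and 331); the data structure and function names are illustrative assumptions, not part of the cited model.

```python
# Variability subgroups of the ten signature positions, as listed in the text
# (GrsA numbering; Stachelhaus et al., 1999).
VARIABILITY = {
    "invariant": (235, 517),
    "moderately variant": (236, 301, 330),
    "highly variant": (239, 278, 299, 322, 331),
}


def variability_of(position: int) -> str:
    """Return the variability subgroup of a signature position."""
    for subgroup, positions in VARIABILITY.items():
        if position in positions:
            return subgroup
    raise ValueError(f"{position} is not one of the ten signature positions")


# Positions at which the text reports differences between the
# AlbI NRPS-1/3 and AlbIX NRPS-6/7 signatures.
reported_differences = (299, 301, 322, 331)
for pos in reported_differences:
    print(pos, variability_of(pos))
```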

Table 8: Comparison of signature sequences, as defined by Marahiel and co-workers 
(Stachelhaus et al., 1999), derived from sequences between the A4 and A5 domains of the 
AlbI, AlbIV, and AlbIX NRPS modules with those of TyrB-M1 (Pro) (Tyrocidine synthetase 2 
module 1, accession number: AAC45929), VirS (Pro) (Virginiamycin S synthetase, accession 
number: CAA72310), HVCL (hydroxyisovaleric acid-CoA ligase, ACLl enniatin synthetase, 
accession number: S39842), EntF-M1 (Ser) (Enterobactin synthase, accession number: 
AAA92015), β-Ala code (β-Ala selectivity-conferring code defined by Du et al., 2000), 
BacC-M5 (Asn) (Bacitracin synthetase 3, accession number: AAC06348), TyrC-M1 (Asn) 
(Tyrocidine synthetase 3, accession number: AAC45930) and Asn code (Asn 
selectivity-conferring code defined by Marahiel and co-workers (Stachelhaus et al., 1999)). Amino acids 
of AlbI NRPS-1 and NRPS-3 signatures identical or similar to TyrB-M1 (Pro), VirS (Pro) and 
HVCL signatures (A=G; D=E; I=L=V; R=K) are shown in bold. Amino acids of AlbIX 
NRPS-6 and NRPS-7 signatures identical or similar to VirS (Pro) and Blm (β-Ala) signatures 
(A=G; D=E; I=L=V; R=K) are shown in bold. Variability: 0 indicates invariant residues, +/- 
moderately variant residues and ++ highly variant residues. 



Position in GrsA (Phe) and variability 

EXAMPLE 18: Identification of putative promoters and putative terminators in 
XALBl 

Putative rho-independent terminators were identified downstream from albIV and 
albXVI using the Terminator program (Brendel and Trifonov, 1984), run with the Wisconsin 
Package™ GCG software (Figure 6). In the Figure, dashes indicate palindromic sequences. 
Symbols used in the Figure are: P, Primary structure value of putative terminator (minimum 
threshold value of 3.5 represents 95 percent of known, factor-independent, prokaryotic 
terminators); S, Secondary structure value of putative terminator. The presence of these 
terminators confirmed the proposed genetic organization of operons 1 and 3. A 
rho-independent terminator was identified in the intergenic region between albXVII and albXVIII, 
suggesting that the group of genes initially supposed to be organized in operon 4 may be in 
fact organized in two operons, operon 4 formed by albXVII and operon 5 by albXVIII-albXX. 
No putative rho-independent terminator was found downstream from albIX and from 
albXX. 

The 236 bp region between albI (operon 1) and albV (operon 2) is 100% identical to 
the sequence between xabB and thp genes that is assumed to contain a bidirectional promoter 
(Huang et al., 2000a and 2001), suggesting that transcription of operons 1 and 2 is regulated by 
the same bidirectional promoter region (Huang et al., 2001). 

The 412 bp region comprised between albX (operon 3) and albXVII (operon 4) also 
contains a putative bidirectional promoter (Figure 7). In the Figure, the sequences of putative 
promoters are underlined, and putative ATG or TTG start codons are in bold. The closest 
matches (TTGACA-18x-TATAGT) to the consensus -35 (TTGACA) and -10 (TATAAT) 
sequences for E. coli σ70 promoters occur 61 bp upstream from albX (operon 3). The closest 
matches (TTCAGA-19x-TATACA) to the consensus sequences for E. coli σ70 promoters 
occur 320 bp upstream from albXVII (operon 4). The region between albXVII and albXVIII 
lacks any apparent E. coli σ70 promoter. However, the sequence immediately upstream from 
albXIX, corresponding to the coding sequence of albXVIII, potentially contains a 
unidirectional promoter (Figure 7). The closest match (TTGCTC-19x-TATATT) to the 
consensus sequences for E. coli σ70 promoters occurs 33 bp upstream from albXIX. The 
presence of a terminator downstream from albXVII and of a promoter upstream from albXIX 
suggests that albXVIII is not transcribed and that albXIX and albXX form operon 5. 
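
How closely each putative promoter matches the consensus can be quantified by counting identical positions in the -35 and -10 hexamers. The sketch below uses only the hexamers and spacer lengths quoted in the text; the scoring by simple identity count is an illustrative choice and is not the method stated in the source.

```python
# E. coli sigma-70 consensus hexamers quoted in the text.
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"

# Putative promoters from the text: (-35 hexamer, spacer length, -10 hexamer).
CANDIDATES = {
    "albX (operon 3)": ("TTGACA", 18, "TATAGT"),
    "albXVII (operon 4)": ("TTCAGA", 19, "TATACA"),
    "albXIX (operon 5)": ("TTGCTC", 19, "TATATT"),
}


def matches(hexamer: str, consensus: str) -> int:
    """Count positions at which the hexamer is identical to the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))


for gene, (box35, spacer, box10) in CANDIDATES.items():
    score35 = matches(box35, CONSENSUS_35)
    score10 = matches(box10, CONSENSUS_10)
    print(f"{gene}: -35 {score35}/6, spacer {spacer} nt, -10 {score10}/6")
```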

EXAMPLE 19: Cloning of the XALB2 gene cluster 

The 6 kb EcoR I fragment carrying Tn5 and flanking sequence from strain AM37 was 
cloned in pBR325 and the resulting plasmid was designated pAM37 (Table 1). A 1.1 kb 
Hind III-Hind III DNA fragment from pAM37, named PR37 (Table 1), was labeled and 
used to probe the 845 clones from the genomic library of X. albilineans strain Xa23R1, 
previously described (Rott et al., 1996). Eight new cosmids hybridized to this probe and 
restored albicidin production in mutant AM37. One of these cosmids, pALB389, carrying an 
insert of about 37 kb (Table 1), was used for complementation studies of the five mutants not 
complemented by pALB540 and pALB571. Cosmid pALB389 complemented mutants AM10 
and AM37. Mutant AM10 was initially thought to be complemented by pALB639 (Rott et 
al., 1996). However, further complementation studies showed that mutant AM10 was not 
complemented by pALB639 and that only three mutants (AM12, AM13 and AM36) were 
complemented by pALB639 containing the third genomic region XALB3 involved in 
albicidin production. A 3 kb EcoR I-EcoR I DNA fragment from pALB389 that hybridized 
with probe PR37 was sub-cloned in pUFR043 (Table 1). The resulting plasmid pAC389.1 
complemented mutants AM10 and AM37, confirming that the second region involved in 
albicidin production, XALB2, was present in the 3 kb insert of pAC389.1. 

EXAMPLE 20: Cloning of the XALB3 gene cluster 

Cosmid pALB639, carrying an insert of 36 kb (Rott et al., 1996; Table 1), was used as 
a probe to compare the EcoR I restriction profiles of X. albilineans strain Xa23R1 with those 
of mutants AM12, AM13 and AM36, which were supposed to be mutated in the XALB3 gene 
cluster. An 11 kb band which was found in strain Xa23R1 but not in the three mutants was 
supposed to contain the XALB3 gene cluster. A 9.7 kb EcoR I DNA fragment purified from 
cosmid pALB639, also used as a probe in Southern blot analysis, revealed the same 11 kb 
band. This 9.7 kb EcoR I DNA fragment was sub-cloned in pUFR043 (Table 1) and the 
resulting plasmid pALB639A complemented mutants AM12, AM13 and AM36. The third 
region involved in albicidin production, XALB3, was therefore present in the 9.7 kb insert of 
pALB639A. 

EXAMPLE 21: Sequencing of the Tn5 insertional site of tox- mutants located in 
XALB2 and XALB3 and sequencing of the genomic regions XALB2 
and XALB3 

In Figure 8, E, H, Sa and S indicate restriction endonuclease cut sites for EcoR I, 
Hind III, Sal I and Sau3A I, respectively. The DNA inserts carried by plasmids pAC389.1, 
pALB639A or pEV639 are represented by the bars at the top of the respective figures. 
Positions of the Tn5 insertional sites of mutants AM10, AM12, AM36 and AM37 were 
determined by sequencing and are indicated by vertical arrows. The DNA regions 
corresponding to the Tn5 flanking regions in pAM10, pAM12.1, pAM36.2 and pAM37 and in 
the PR37 DNA fragment are represented by the bars at the bottom of the respective figures. 
The location and direction of albXXI and albXXII are indicated by thick black arrows. The 
locations of other orfs in XALB2 similar to those described by Huang et al. (2000b) are 
indicated by thick white arrows. 
The 7 kb EcoR I fragment carrying Tn5 and flanking sequence from strain AM10 was 
cloned in pBluescript II KS (+), and the resulting plasmid was designated pAM10 (Table 1). 
The sequences between EcoR I sites and the Tn5 insertional site of mutants AM10 and AM37 
were sequenced from the resulting plasmids pAM10 and pAM37, respectively. The complete 
double-strand nucleotide sequence of the 2,986 bp EcoR I - EcoR I insert of pAC389.1 was 
determined from sequencing results of plasmids pAC389.1, pAM10 and pAM37 (Figure 8). 
The Tn5 insertional sites of mutants AM10 and AM37 were sequenced from plasmids pAM10 
and pAM37 (Table 1), respectively, using the sequencing primer GUSN 
(5'tgcccacaggccgtcgagt3') that annealed 135 bp downstream from the insertional sequence 
IS50L of Tn5-gusA. The Tn5 insertional sites of AM10 and AM37 were located at positions 
2107 and 1882, respectively. 

The EcoR I fragments carrying Tn5 and the flanking sequences from mutants AM12 
and AM36 were cloned in pBR325 (Rott et al., 1996; Table 1). The sequences between the EcoR I 
site and the Tn5 insertional site of mutants AM12 and AM36 were sequenced from the 
resulting plasmids pAM12.1 and pAM36.2, respectively. The complete double-strand 
nucleotide sequence of the 9,673 bp EcoR I - Sau3A I insert of pALB639A was determined 
from the sequencing results of plasmids pAM12.1, pAM36.2 and pALB639A (Figure 8). The 
Tn5 insertional sites of mutants AM12 and AM36 were sequenced from plasmids pAM12.1 and 
pAM36.2 using the sequencing primer GUSN (5'tgcccacaggccgtcgagt3') that annealed 135 bp 
downstream from the insertional sequence IS50L of Tn5-gusA. The Tn5 insertional sites of 
AM12 and AM36 were located at positions 6500 and 7232, respectively (Figure 8). 

EXAMPLE 22: Homology analysis and genetic organization of XALB2 (Figure 8). 

The sequence of 2986 bp containing XALB2 is 99.4% identical to the sequence of 2989 bp
containing xabA described in X. albilineans strain LS155 from Australia (Huang et al.,
2000b; accession number AF191324). The Tn5 insertional site of mutant LS156 described in
xabA is 15 bp upstream from the insertional site of AM37. The orf disrupted in AM37 and
AM10, designated albXXI, is identical to xabA except for a C which replaces a T at position
1642. albXXI potentially encodes a protein of 278 aa with a predicted size of 29.3 kDa
which is 100% identical to the potential product of xabA, described as a
phosphopantetheinyl transferase (Huang et al., 2000b). Region XALB2 contains three
additional orfs (orf1, orf2 and orf3) similar to those described by Huang et al. (2000b;
orf, rsp6 and aspT). orf2 and orf3 are 100% identical to rsp6 and aspT, respectively, and
orf1 is similar to but smaller than orf. There are no close matches to the E. coli σ70
promoter -10 (TATAAT) and -35 (TTGACA) consensus sequences, and no putative RBS upstream
from the putative start codon ATG of albXXI. The putative factor-independent transcription
termination site described at 42 bp downstream from the TGA stop codon of xabA (Huang et
al., 2000b) is also present at the same position downstream from albXXI.
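
A search for matches to the -10 and -35 consensus hexamers of the kind reported above can
be sketched as follows. This is only an illustrative sketch, not the procedure used in
this work: the function names, the mismatch threshold and the 16 to 19 bp spacer window
are assumptions introduced for the example.

```python
MINUS_35 = "TTGACA"  # -35 consensus quoted in the text
MINUS_10 = "TATAAT"  # -10 consensus quoted in the text


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoter_candidates(seq, max_mismatch=1, spacer=range(16, 20)):
    """Return (pos35, pos10, total_mismatches) for candidate hexamer pairs.

    Positions are 0-based offsets of the first base of each hexamer.
    max_mismatch and spacer are assumed, adjustable search parameters.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        m35 = mismatches(seq[i:i + 6], MINUS_35)
        if m35 > max_mismatch:
            continue
        for gap in spacer:
            j = i + 6 + gap
            box10 = seq[j:j + 6]
            if len(box10) < 6:
                continue  # ran off the end of the sequence
            m10 = mismatches(box10, MINUS_10)
            if m10 <= max_mismatch:
                hits.append((i, j, m35 + m10))
    return hits


# Hypothetical test sequence: perfect hexamers separated by a 17 bp spacer.
demo = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "CC"
demo_hits = find_promoter_candidates(demo)
```

An empty result for the region upstream from a start codon would correspond to the
statement that no close match to the consensus was found.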


EXAMPLE 23: Homology analysis and genetic organization of XALB3 (Figure 8). 

The orf disrupted in mutants AM12 and AM36 was located between nucleotides 6090 (ATG) and
8009 (TAA) and was designated albXXII. The first ATG at position 6090 is not preceded by a
putative ribosome binding sequence, suggesting that the start codon is the ATG at position
6105, which is preceded at position -9 by the putative ribosome binding site sequence GGAG.
A putative rho-independent terminator was identified at position 8082, 73 bp downstream
from albXXII (Figure 6). There are no close matches to the E. coli promoter -10 (TATAAT)
and -35 (TTGACA) consensus sequences upstream from the putative start codon. The Sal I DNA
fragment corresponding to the DNA sequence from nucleotide 5510 to nucleotide 8124, which
contains the 595 bp upstream from the putative start codon, the orf albXXII and the
putative rho-independent terminator, was sub-cloned in pUFR043 in the opposite direction
to lacZ (Table 1). The resulting plasmid pEV639 (Table 1) complemented mutants AM12, AM13
and AM36, confirming that (i) the third region involved in albicidin production, XALB3,
was present in the insert of pEV639; (ii) albXXII is not transcribed as part of a larger
operon; and (iii) the 595 bp upstream from the putative start codon contain a promoter.

The potential product of albXXII, designated AlbXXII, is a protein of 634 aa with a
predicted size of 71.5 kDa. This protein is very similar to the heat shock protein HtpG
from Pseudomonas aeruginosa (identities = 82%) and from Escherichia coli (identities =
60%) (Table 4). The methionine encoded by the putative start codon at position 6105 was
aligned with the first amino acid of the heat shock protein HtpG from Pseudomonas
aeruginosa, confirming that albXXII initiates at position 6105.
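
The coordinates given for albXXII can be cross-checked with simple arithmetic. The sketch
below assumes 1-based inclusive coordinates, with 6105 the first base of the start codon
and 8009 the last base of the TAA stop codon; the function and variable names are
illustrative only.

```python
def orf_length_aa(start: int, stop_end: int) -> int:
    """Number of amino acids encoded by an ORF.

    start    -- 1-based position of the first base of the start codon
    stop_end -- 1-based position of the last base of the stop codon
    """
    n_nt = stop_end - start + 1
    if n_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return n_nt // 3 - 1  # the stop codon does not encode an amino acid


START = 6105        # ATG preceded by the putative RBS GGAG
STOP_END = 8009     # end of the TAA stop codon
TERMINATOR = 8082   # putative rho-independent terminator
UPSTREAM_BP = 595   # upstream region carried by the sub-cloned fragment

protein_length = orf_length_aa(START, STOP_END)   # length of AlbXXII in aa
terminator_offset = TERMINATOR - STOP_END         # distance downstream of the orf
fragment_start = START - UPSTREAM_BP              # first nucleotide of the fragment
in_frame = (START - 6090) % 3 == 0                # both ATG codons share a frame
```

Under these assumptions the values agree with the 634 aa product, the 73 bp distance to
the terminator and a sub-cloned fragment beginning at nucleotide 5510.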

The invention includes the isolation and sequencing of a region of 55,839 bp from X.
albilineans strain Xa23R1 containing the major gene cluster XALB1 involved in albicidin
production. Analysis of this region allowed us to predict the genetic organization of the
gene cluster XALB1, which contains 20 ORFs grouped in four or five operons (Figure 1).
Because albXVIII is a truncated gene, XALB1 genes may be organized in five operons.
Therefore, we will from now on consider albXVII as part of operon 4 and albXIX and albXX
as part of operon 5. Similar operon-type organizations for antibiotic biosynthesis
clusters are well known and have been postulated to facilitate cotranslation of genes
within the operon to yield equimolar amounts of proteins for optimal interactions to form
the biosynthesis complexes (Cane, 1997). Overlapping genes involved in the same process
are also quite common in bacteria (Normark et al., 1983).

Previous results of transposon mutagenesis and complementation studies (Rott et al., 1996;
Rott, unpublished results) are in accordance with the predicted genetic organization of
XALB1 described in this study, and allowed us to establish that operons 1, 2 and 3 are
involved in albicidin biosynthesis: (i) Tox⁻ mutants with a Tn5-gusA insertion site
located in DNA fragments B, C, G and D were complemented by cosmid pALB571 and not by
cosmid pALB540, confirming that cosmid pALB571 potentially contains the entire operon 1;
(ii) Tox⁻ mutants with a Tn5-gusA insertion site located in DNA fragments A and H were
complemented by both cosmids pALB540 and pALB571, confirming that both cosmids potentially
contain the entire operon 2; (iii) mutant XaAM1, with a Tn5-gusA insertion site located in
DNA fragment J, is the only Tn5 Tox⁻ mutant complemented by cosmid pALB540 and not by
cosmid pALB571, confirming that cosmid pALB540 potentially contains the entire operon 3.
Our mutagenesis studies did not confirm that operons 4 and 5 are required for biosynthesis
of albicidin. para-Aminobenzoate (PABA) is required for the growth of many bacteria,
probably including X. albilineans, suggesting that a mutation in albXVII may be lethal and
explaining why we did not obtain any mutant disrupted in this gene.
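
The complementation patterns summarized above amount to a small decision rule, which can
be sketched as follows. The function name and the table layout are illustrative
assumptions; only the fragment letters, cosmid names and operon numbers are taken from the
text.

```python
# DNA fragment carrying the Tn5-gusA insertion -> operon inferred in the text.
FRAGMENT_TO_OPERON = {
    "B": 1, "C": 1, "G": 1, "D": 1,  # complemented by pALB571 only
    "A": 2, "H": 2,                  # complemented by both cosmids
    "J": 3,                          # complemented by pALB540 only (mutant XaAM1)
}


def infer_operon(by_pALB571: bool, by_pALB540: bool):
    """Map a complementation pattern to the operon it points to.

    Returns None when neither cosmid complements the mutant, a case for
    which the text draws no conclusion.
    """
    if by_pALB571 and by_pALB540:
        return 2
    if by_pALB571:
        return 1
    if by_pALB540:
        return 3
    return None
```

For example, `infer_operon(True, False)` corresponds to the mutants with insertions in
fragments B, C, G and D.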

Putative bidirectional promoters were identified between operons 1 and 2 (Huang et al.,
2001) and between operons 3 and 4 (Figure 7), confirming the predicted genetic
organization of XALB1. The region upstream from operon 1 is 100% identical to the region
upstream from the xabB start codon, which was described as a functional promoter during the
phase of albicidin accumulation in Australian strain Xa13 of X. albilineans (Huang et al.,
2001). The involvement of several operons in albicidin biosynthesis supposes that they are
transcribed at the same time. Promoter activities of the regions upstream from putative
operons 2, 3, 4 and 5 need to be determined to establish whether these promoters are
functional during the same growth phase of X. albilineans as the promoter upstream from
operon 1.

Potential rho-independent transcription terminators were identified downstream from
operons 1, 3 and 4 (Figure 6), confirming the predicted genetic organization of these
three operons. Because operons 2 and 5 are convergent (Figure 1) and separated by a very
short region of 22 bp between albIX and albXX, stop codons may allow transcription
termination in the absence of sequences corresponding to potential rho-independent
transcription terminators downstream from these operons. It is quite possible that
simultaneous transcription of operons 2 and 5, involving the presence of two RNA
polymerases (one on each strand of DNA), may cause the RNA polymerases to pause at the end
of each operon because of steric interference between the two polymerase complexes in the
same short region.

The presence of putative RBSs upstream of the ATG start codons of all ORFs, except for
albXVIII, suggests that these ORFs are translated in X. albilineans. The absence of a
canonical RBS upstream from albXVIII further indicates that this ORF is probably not
expressed. GTG and TTG codons (usually valine and leucine codons) generally serve as
prokaryotic start codons when located near the 5' end of an RNA message, but GTG start
codons were also described far from the 5' end of messenger RNA in the bacitracin
biosynthesis cluster of B. licheniformis (GenBank accession no. AF184956) or in the
bleomycin biosynthetic gene cluster of S. verticillus (GenBank accession no. AF210249).
This is in accordance with the fact that the two potential TTG start codons are the first
start codons in operons 1 and 4 of XALB1, and that the two potential GTG start codons
initiate internal cistrons. The albI and albXVII genes, like xabB (Huang et al., 2001), use
TTG as a start codon, which may impose post-transcriptional control of the rate of gene
product formation (McCarthy and Gualerzi, 1990).

The predicted genetic organization of operons 1 and 2 presents similarities with the
organization of the region involved in albicidin production in strain Xa13 of X.
albilineans from Australia (Huang et al., 2000a; Huang et al., 2001). This latter region
also contains two divergent operons involved in albicidin production, one comprising the
xabB gene (similar to albI, but with a large deletion) and the xabC gene (100% identical to
albII) and the other containing the thp gene (100% identical to albV). In addition, the
sequence between the two operons in strain Xa13 is 100% identical to the sequence between
operons 1 and 2, indicating that both clusters are controlled by the same bidirectional
promoter. However, transposon mutagenesis studies of Xa13 showed no evidence of another
cistron downstream of xabC that may be involved in albicidin production (Huang et al.,
2000a), suggesting that the Xa13 xab operon differs from the Xa23R1 operon 1, which
contains two additional genes downstream from albII that are potentially involved in
albicidin production (albIII and albIV; refer Figure

Homology analysis revealed that four NRPS and/or PKS genes are present in XALB1 (Figure
2), and these genes may be involved in the biosynthesis of the albicidin
polyketide-polypeptide backbone (albI, albIV, albVII and albIX). NRPS and PKS enzymes are
generally organized into repeated functional units known as modules, each of which is
responsible for a discrete stage of polyketide or polypeptide chain elongation (Cane and
Walsh, 1999). Each PKS or NRPS module is made up of a set of three core domains, two of
which are catalytic and one of which acts as a carrier, and which together are responsible
for the central chain-building reactions of polyketide or polypeptide biosynthesis. Both
PKS and NRPS core domains utilize analogous acyl-chain elongation strategies in which the
growing chain, tethered as an acyl-S-enzyme to the flexible 20 Å long phosphopantetheinyl
arm of an acyl carrier protein (ACP) or peptidyl carrier protein (PCP) domain, acts as the
electrophilic partner that undergoes attack by a nucleophilic chain-elongation unit, a
malonyl- or aminoacyl-S-enzyme derivative, respectively, itself covalently bound to a
downstream ACP/PCP domain. In the case of a PKS, the fundamental chain-elongation reaction,
a C-C bond-forming step, is mediated by a ketosynthase (KS) domain that catalyzes the
transfer of the polyketide acyl chain to an active-site cysteine of the KS domain, followed
by condensation with the methylmalonyl- or malonyl-S-ACP by a decarboxylative acylation of
the malonyl donor unit. An additional essential component of the core PKS chain-elongation
apparatus is an associated acyltransferase (AT) domain, which catalyzes the priming of the
donor ACP sidearm with the appropriate monomer substrate, usually methylmalonyl- or
malonyl-CoA. The comparable core domains of an NRPS biosynthetic module function in a
chemically distinct but architecturally and mechanistically analogous fashion. In the
latter case, the key chain-building reaction, a C-N bond-forming reaction, involves the
generation of the characteristic peptide bond by nucleophilic attack of the amino group of
an amino acyl-S-PCP donor on the acyl group of an upstream electrophilic acyl- or peptidyl
acyl-S-PCP chain, catalyzed by a condensation (C) domain. In functional analogy to the PKS
AT domain, the core of the NRPS module utilizes an adenylation (A) domain to activate the
donor amino-acid monomer as an acyl-AMP intermediate, which is then loaded onto the
downstream PCP side chain. Both the AT and A domains of the respective PKS and NRPS modules
act as important gatekeepers for polyketide or polypeptide biosynthesis, exhibiting strict
or at least high specificity for their cognate malonyl-CoA, methylmalonyl-CoA or amino acid
substrates. In addition to the basic subset of core domains, each PKS or NRPS also has a
special set of dedicated domains responsible both for the initiation of acyl-chain assembly
by loading of a starter unit onto the first, furthest upstream PKS/NRPS module, as well as
a chain-terminating thioesterase (TE) domain, most often found fused to the last module,
that is responsible for detachment of the most downstream covalent acyl enzyme intermediate
and off-loading of the mature polyketide or polypeptide chain (Cane and Walsh, 1999).
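
The core-domain organization described above lends itself to a simple check of whether a
given module carries its full set of three core domains. The sketch below is illustrative
only: the data layout and function name are assumptions, and the example modules are
hypothetical rather than annotations of any particular protein.

```python
# Three core domains per module type, as described in the text:
# two catalytic domains and one carrier domain.
CORE_DOMAINS = {
    "PKS": ("KS", "AT", "ACP"),   # ketosynthase, acyltransferase, acyl carrier
    "NRPS": ("C", "A", "PCP"),    # condensation, adenylation, peptidyl carrier
}


def missing_core_domains(kind, domains):
    """Return the core domains of a PKS or NRPS module that are absent.

    kind    -- "PKS" or "NRPS"
    domains -- iterable of domain abbreviations found in the module
    """
    present = set(domains)
    return [d for d in CORE_DOMAINS[kind] if d not in present]


# Hypothetical examples.
complete_nrps = missing_core_domains("NRPS", ["C", "A", "PCP"])
pks_without_at = missing_core_domains("PKS", ["KS", "KR", "ACP", "ACP"])
lone_c_domain = missing_core_domains("NRPS", ["C"])
```

A module reported as lacking AT (for a PKS) or A (for an NRPS) would need its carrier
domain loaded by a domain acting in trans.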

XALB1 potentially encodes four PKS modules and seven NRPS modules. Most of the bacterial
NRPS gene clusters described up to now are organized in operon-type structures encoding
multimodular NRPS proteins with individual modules organized along the chromosome in a
linear order that parallels the order of amino acids in the resultant peptide, following
the "colinearity rule" for the NRPS-template assembly of peptides from amino acids (Cane,
1997; Cane et al., 1998; Cane and Walsh, 1999; von Dohren et al., 1999). PKS and NRPS
modules are apparently not organized according to this "colinearity rule" for albicidin
biosynthesis because of the following features: (i) NRPS and PKS genes are expressed in two
divergent operons; (ii) no AT domain was identified in the PKS-2 and PKS-3 modules,
suggesting the involvement of a separate enzyme; (iii) the A domain of NRPS-2 is not
functional, suggesting the involvement of a trans-acting A domain; (iv) a single
chain-terminating TE domain was identified in XALB1, which may be responsible for the
release of the full-length albicidin polyketide-polypeptide backbone from the enzyme
complexes. Exceptions to the "colinearity rule" have also been shown for the syringomycin
synthetase of P. syringae (Guenzi et al., 1998), for the exochelin synthetase of
Mycobacterium smegmatis (Yu et al., 1998) and for the bleomycin synthetases of S.
verticillus (Du et al., 2000).

On the basis of the deduced functions of individual NRPS and PKS domains we have aligned
the four PKS and the seven NRPS modules to suggest two different putative linear models for
the synthesis of the albicidin polyketide-peptide backbone (Figure 9). In the figure, NRPS
and PKS domains are abbreviated as follows: A, adenylation; ACP, acyl carrier protein; AL,
acyl-CoA ligase; AT, acyltransferase; C, condensation; HBCL, hydroxybenzoate-CoA ligase;
KR, ketoreductase; KS, ketoacyl synthase; PCP, peptidyl carrier protein. Asn designates
asparagine. X1 and X2 indicate the substrates incorporated by NRPS-1 and 3 and by NRPS-6
and 7, respectively. The crossed A domain in NRPS-2 indicates that this deleted domain may
not be functional. In model 1 (Figure 9A), (i) the PKS-1 module alone is responsible for
the initiation of the acyl-chain assembly, (ii) PKS-4 (HBCL) interacts with PKS-2 and PKS-3
as an AT domain to allow acyl transfer and (iii) NRPS-5 interacts only with NRPS-2. In
model 2 (Figure 9B), two different modules, PKS-1 and PKS-4, are responsible for this
initiation step. Model 2 leads to the biosynthesis of four different
polyketide-polypeptide backbones; in this model (i) PKS-1 (AL) and PKS-4 (HBCL) are in
competition for initiation of albicidin precursors; (ii) a separate AT enzyme (potentially
AlbXIII) interacts with PKS-2 and PKS-3 to allow acyl transfer; (iii) NRPS-5 interacts with
NRPS-2; and (iv) NRPS-5 and NRPS-6 are in competition for interaction with NRPS-4.

Both models are based on the fact that PKS-1 contains the AL and ACP1 domains, and PKS-4
shows homology with the hydroxybenzoate-CoA ligases. In other PKS systems, an N-terminal AL
domain is involved in the activation and incorporation of a 3,4-dihydroxycyclohexane
carboxylic acid, a 3-amino-5-hydroxybenzoic acid or a long-chain fatty acid as a starter
(Aparicio et al., 1996; Motamedi and Shafiee, 1998; Tang et al., 1998; Duitman et al.,
1999). PKS-4 may also be involved in the activation and incorporation of hydroxybenzoate,
but this module lacks any ACP or PCP domain, suggesting that PKS-4 is responsible for
initiation of the acyl-chain assembly (Figure 9B) onto one of the three ACP domains of AlbI
(ACP1, ACP2 or ACP3). The 277 amino acids preceding the PKS-4 module in AlbVII may be
necessary for the intercommunication between AlbVII and AlbI. The presence of two different
PKS modules potentially involved in the initiation of the acyl-chain assembly suggests a
competition of these two modules for the initiation of two different albicidin
polyketide-polypeptide backbones, and this could contribute to the production of multiple,
structurally related albicidins by the same cluster XALB1. Production of two different
components, one of them initiated by PKS-4 and containing an additional aromatic ring due
to incorporation of hydroxybenzoate, may explain why partial characterization of albicidin
indicated the presence of a variable number (three or four) of aromatic rings (Huang et
al., 2001).

In AlbI, PKS-1 is followed by the PKS-2 module, which contains a KS domain and a KR domain
upstream from two ACP domains (ACP2 and ACP3) and lacks any discernible AT domain. Tandem
ACP domains are unusual within PKS modules but have been shown to occur in several fungal
and bacterial polyketide synthases (Mayorga and Timberlake, 1992; Yu and Leonard, 1995;
Takano et al., 1995; Albertini et al., 1995). However, the significance of the tandem ACP
domains in these systems has not yet been resolved. In our model 2, one of the tandem ACP
domains (ACP2 or ACP3) may interact with PKS-4 for the initiation of an acyl-chain assembly
(Figure 9B). The absence of an AT domain in the PKS-2 module suggests that a separate AT
domain is indispensable for the elongation of the acyl chain initiated by this module.
Separate AT enzymes encoded elsewhere in the genome have been described in other systems
for two PKS modules lacking AT domains: a malonyl-CoA transacylase gene (fenF) located
immediately upstream from the B. subtilis PKS-NRPS mycA gene (Duitman et al., 1999) and an
AT gene located 20 kb upstream from the M. xanthus NRPS-PKS ta1 gene (Paitan et al., 1999).
We have not identified an AT gene in the gene cluster XALB1 or in the two other genomic
regions involved in albicidin production, XALB2 and XALB3, suggesting that the trans-acting
AT gene may be encoded elsewhere in the genome. However, AlbXIII, which contains the motif
GHSxG conserved in AT domains, may potentially be involved in the acyl transfer, but the
similarity of AlbXIII with AT domains is not high enough to confirm this potential function
of AlbXIII (Figure 10).

Figure 10A describes the alignment of the conserved motifs in AT domains from RifA-1, -2,
-3, RifE-1 (rifamycin PKSs; August et al., 1998) and BlmVIII (bleomycin PKS; Du et al.,
2000); identical amino acids are shown in bold. Figure 10B describes the alignment of
AlbXIII (SEQ ID No. 38), FenF (a malonyl-CoA transacylase located upstream from mycA;
Duitman et al., 1999) and LipA (a lipase; Valdez et al., 1999); amino acids identical to
conserved AT domain motifs are shown in bold.

AlbXIII contains only four of the eleven amino acids conserved in AT domains of rifamycin
PKSs (August et al., 1998) and bleomycin PKS (Du et al., 2000), and the AlbXIII sequence
appears to be more closely related to lipases such as LipA (Valdez et al., 1999) than to
AT domains (Figure 10). However, FenF, the trans-acting AT domain involved in mycosubtilin
biosynthesis, contains only seven of the eleven amino acids conserved in AT domains
(Duitman et al., 1999; Figure 10). AlbVII, which contains an HBCL domain, may be another
candidate for the acyl transfer in PKS-2 (Figure 9A) because HBCL exhibits some similarity
with A domains at the level of cores A1, A2, A3, A4, A5 and A6 (Table 6). However, no HBCL
involved in such a function has been described in the PKSs characterized so far.

In AlbI, PKS-2 is followed by the PKS-3 module, which contains the KS2 and PCP1 domains and
lacks any discernible AT or A domain. PKS-3 is located upstream from the NRPS modules and
should therefore be involved in the linkage of the polyketide and polypeptide moieties. The
presence of a PCP domain in the PKS-3 module suggests the involvement of a trans-acting A
domain rather than an AT domain. A putative candidate for this trans-acting A domain is the
AlbIV NRPS-5 A domain because of the lack of a C domain in the AlbIV NRPS-5 module.
However, by analogy with the BlmVIII PKS module, which is involved in the linkage of the
polypeptide and polyketide moieties of bleomycin and which contains an AT domain followed
by a PCP domain (Du et al., 2000), the presence of a PCP is not incompatible with a
possible interaction of the AlbI PKS-3 module with a separate AT domain. This latter
trans-acting AT domain may be the same as the one that interacts with the AlbI PKS-2
module: the AlbVII PKS-4 module, AlbXIII or an unidentified separate AT domain.

In AlbI, the PKS-3 module is followed by four NRPS modules. The NRPS-1, 2 and 3 modules
display the ordered C, A and PCP domains, suggesting that they are involved in the
incorporation of three amino acid residues. The A domain of the NRPS-2 module exhibits poor
consensus at the A2, A3, A5, A7, A8, A9 and A10 motifs and completely lacks the A6 motif
(Table 6). In addition, the NRPS-2 substrate binding pocket is partially deleted (Figure
5). These features strongly suggest that the NRPS-2 A domain is inactive and that the
loading of an amino acid on the NRPS-2 PCP domain (PCP3) is possibly catalyzed by a
trans-acting A domain, as in HMWP1 (Gehring et al., 1998) and BlmIII (Du et al., 2000). A
putative candidate for this trans-acting A domain is the NRPS-5 A domain present in AlbIV
because of the lack of a C domain in NRPS-5 (Figure 2). The additional sequence of 300
amino acids present in the A domain of NRPS-5 may be necessary for the intercommunication
between AlbI and AlbIV. As a consequence of the interaction between NRPS-2 and NRPS-5, a
competition between the PCP-3 and PCP-5 domains must occur to bind the amino acid activated
by the NRPS-5 A domain. A similar competition between two PCP domains was described for
syringomycin biosynthesis, during the interaction between SyrB, which contains A and PCP
domains, and the last module of SyrE, which contains C and PCP domains (Guenzi et al.,
1998). The NRPS-4 module contains only a C domain, which may transfer the intermediate
products synthesized by AlbI to a PCP domain present in another albicidin synthase. Similar
transfers were described for mycosubtilin biosynthesis, in which the MycA and MycB
C-terminal C domains interact with the MycB and MycC N-terminal A domains, respectively
(Duitman et al., 1999). Two different PCP domains may be involved in the transfer of the
intermediate products synthesized by AlbI: the PCP-5 and PCP-6 domains, which are present
in the AlbIV NRPS-5 and AlbIX NRPS-6 modules, respectively. This possible competition
between the two NRPS modules that contain two different A domains could also contribute to
the production of multiple, structurally related albicidins by the gene cluster XALB1
(Figure 9B). Because of the absence of a C domain in the AlbIX NRPS-6 module, the
intermediate product bound on the AlbIV PCP-5 domain would necessarily be transferred to
the AlbIX PCP-7 domain, like the intermediate product bound on AlbIX PCP-6. AlbIX NRPS-7,
which contains the single chain-terminating TE domain, may then be responsible for
detachment of the mature albicidin polyketide-polypeptide backbone from the complex of
enzymes.

The linear model 1 implies that operon 1 and operon 2 in X. albilineans strain Xa23R1 from
Florida potentially produce only one albicidin polyketide-polypeptide backbone, with a
competition at the level of ACP2/ACP3 and of PCP3 and PCP5 which could explain the
production by X. albilineans of compounds structurally related to albicidin (Figure 9A).
The linear model 2 implies that operon 1 and operon 2 in X. albilineans strain Xa23R1 from
Florida potentially produce four different albicidin polyketide-polypeptide backbones
(Figure 9B) because of (i) the competition of the AL and HBCL domains for initiation of
acyl chain assembly and (ii) the competition of the AlbIV NRPS-5 and AlbIX NRPS-6 modules
for the incorporation of the next-to-last amino acid of the albicidin backbone. These four
albicidin backbones may lead to the production of four structurally very different
components. The polyketide moieties of the acyl chains initiated by the AlbI AL domain or
by the AlbVII HBCL domain may be very different. The polyketide moiety of acyl chains
initiated by the AlbVII HBCL domain may be shorter and may contain an additional aromatic
ring. The presence of four structurally different metabolites may explain the difficulty
observed by Birch and Patil (1985a) in purifying albicidin and determining its chemical
structure.
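
The count of four backbones in model 2 follows from two independent two-way competitions,
which can be enumerated as below. This is a bookkeeping sketch only; the labels and the
function name are illustrative assumptions and nothing here predicts chemical structures.

```python
from itertools import product

# Competition (i): which domain initiates acyl-chain assembly.
INITIATION = ("AlbI AL (PKS-1)", "AlbVII HBCL (PKS-4)")
# Competition (ii): which module supplies the next-to-last amino acid.
NEXT_TO_LAST = ("AlbIV NRPS-5", "AlbIX NRPS-6")


def enumerate_backbones():
    """List every combination of the two competitions assumed in model 2."""
    return [
        {"initiation": start, "next_to_last": module}
        for start, module in product(INITIATION, NEXT_TO_LAST)
    ]


backbones = enumerate_backbones()
```

Two choices at each of the two steps give 2 x 2 = 4 combinations, one per putative
backbone.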

Homology analysis also revealed that AlbI NRPS-1 and 3 and AlbIX NRPS-6 and 7 specify
unusual substrates which seem to contain an amino group and a carboxylate group but to be
different from α-amino acids and β-alanine. Identification of several aromatic rings in
albicidin (Huang et al., 2001) suggested that NRPS-1, -3, -6 and -7 are involved in the
incorporation of aromatic substrates. By analogy with the Asp235Val alteration in the β-Ala
specificity-conferring code (Du et al., 2000), the Asp235Ala alteration in the NRPS-1, -3,
-6 and -7 signatures could be consistent with a large distance between the amino group and
the carboxylate group in the substrate specified by these modules. Based on this
hypothesis, we suggest that operons 3, 4 and 5 are involved in the biosynthesis of two
aromatic substrates: the para-aminobenzoate potentially synthesized by AlbXVII
(para-aminobenzoate synthase), and the carbamoyl benzoate potentially synthesized by AlbXX
(hydroxybenzoate synthase) and AlbXV (carbamoyl transferase). Incorporation of these
nonproteinogenic substrates may explain why albicidin is insensitive to proteases (Birch
and Patil, 1985a).

According to biosynthesis model 1, which leads to the biosynthesis of only one
polyketide-polypeptide albicidin backbone that may correspond to the major component
produced by XALB1, we propose a model allowing prediction of the composition and the
structure of albicidin (Figure 11). In the figure, NRPS and PKS domains are abbreviated as
follows: A, adenylation; ACP, acyl carrier protein; AL, acyl-CoA ligase; C, condensation;
KR, ketoreductase; KS, ketoacyl synthase; PCP, peptidyl carrier protein. C atoms of the
albicidin backbone are numbered 1 to 38. Bold methyl groups correspond to methylation of
the albicidin backbone by AlbII or AlbVI. In this model, albicidin biosynthesis is
initiated by loading of an acetyl-CoA by PKS-1 (step 1), and the chain product is elongated
by incorporation of (i) malonyl-CoA by PKS-2 and PKS-3 (steps 2 and 3), (ii)
para-aminobenzoate or carbamoyl benzoate by NRPS-1 and NRPS-3 (steps 4 and 6), (iii)
asparagine by NRPS-2 coupled to NRPS-5 (step 5) and (iv) para-aminobenzoate or carbamoyl
benzoate by NRPS-6 and NRPS-7 (steps 7 and 8). The presence of the KR domain in the PKS-2
module may lead to the formation of a hydroxyl group at the atom of the albicidin
backbone. This hydroxyl group might be methylated by AlbVI (O-methyltransferase). The acyl
chain may also be modified by AlbII (C-methyltransferase) at or C,4.
The chemical composition (C40O15N6H35), the molecular weight (839), and the structure of
the putative XALB1 product are in accordance with the partial characterization of albicidin
published by Birch and Patil (1985a), which indicated that albicidin contains approximately
38 carbon atoms and a carboxylate group and that the molecular weight of albicidin was
about 842. The presence of two ester linkages in our predicted albicidin structure is also
in accordance with the fact that albicidin is detoxified by the AlbD esterase (Zhang and
Birch, 1997). However, an unpublished albicidin analysis cited by Huang et al. (2001)
indicated the presence of (i) two OCH3 groups and not one as in our predicted albicidin
structure, (ii) one CN linkage and not eleven as in our predicted albicidin structure and
(iii) a trisubstituted double bond that is not present in the putative XALB1 product.
Further investigations to identify the substrates of modules NRPS-1, 3, 6 and 7 and to
characterize the structure of albicidin are necessary to validate our model for albicidin
biosynthesis.
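
The stated molecular weight can be compared with an elemental composition by summing
atomic masses. The sketch below takes the composition of the putative XALB1 product to be
C40 O15 N6 H35, which is an assumed reading and should be checked against the predicted
structure in Figure 11; the function name and mass tables are likewise illustrative.

```python
# Integer (nominal) masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "N": 14, "O": 16}
# Approximate standard atomic weights.
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def formula_mass(composition, table):
    """Sum element masses weighted by their counts in the composition."""
    return sum(table[element] * count for element, count in composition.items())


# Assumed composition of the putative XALB1 product (see lead-in).
PUTATIVE_PRODUCT = {"C": 40, "O": 15, "N": 6, "H": 35}

nominal = formula_mass(PUTATIVE_PRODUCT, NOMINAL_MASS)
average = formula_mass(PUTATIVE_PRODUCT, AVERAGE_MASS)
difference_from_reported = 842 - nominal  # versus the value of about 842
```

With this assumed composition the nominal mass is 839, three mass units below the value of
about 842 reported by Birch and Patil.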

In conclusion, homology analysis of XALB1 revealed unprecedented features for hybrid
polyketide-peptide biosynthesis in bacteria involving a trans-action of four PKS and seven
NRPS separate modules which could contribute to the production of multiple, structurally
related polyketide-peptide compounds by the same gene cluster. Characterization of the full
chemical structure of albicidin may be necessary to validate these models. Four NRPS
modules seem to activate a very unusual substrate. Overexpression and purification of the
A domains from these four NRPS modules will be necessary to examine their substrate
specificities. The substrate specificity of each A domain will therefore be determined by
analysis of the ATP-PPi exchange reaction with different substrates putatively incorporated
into albicidin. Investigating albicidin backbone biosynthesis will be of great interest
because such information adds to the limited knowledge as to how PKS and NRPS interact and
how they might be manipulated to engineer novel molecules, and may explain how X.
albilineans produces several structurally related, toxic compounds.

Cloning and sequencing of XALB2 showed that the same phosphopantetheinyl transferase is
required for albicidin production in an X. albilineans strain from Florida and in an X.
albilineans strain from Australia (Huang et al., 2000b), explaining the previous results
showing that strain LS156, mutated in xabA (100% identical to albXXI), was not complemented
by pALB540, pALB571 and pALB639 (Rott et al., 1996). Mutant LS156 was shown to be
complemented by a construct containing the coding sequence of xabA in fusion with lacZ,
revealing that xabA is required for albicidin production and that no other cistron
downstream from xabA was involved in albicidin production (Huang et al., 2000b). However,
this complementation study did not allow determination of whether xabA is transcribed as
part of a larger operon. Here we disclose the complementation of mutant AM37 with a 2986 bp
insert from X. albilineans containing albXXI (100% identical to xabA), confirming that
albXXI is involved in albicidin biosynthesis and indicating that the promoter of albXXI is
present in the 2986 bp insert and that albXXI is not expressed as part of an operon.

Cloning and sequencing of XALB3 showed that a heat shock protein, HtpG, is
involved in albicidin production in X. albilineans. The heat shock protein HtpG is an
Escherichia coli homologue of the eukaryotic HSP90 molecular chaperone. Hsp90 from
eukaryotes has been demonstrated to possess chaperone activity (Jakob et al., 1995), acting as
a non-ATP dependent 'holder', and it also has an important role in signal transduction and the
cell cycle. This protein is essential in both Drosophila and yeast (Borkovich et al., 1989;
Cutforth and Rubin, 1994). In contrast, the HtpG gene can be deleted in E. coli with no effect
on the viability of the strain, with the exception of a decreased growth rate at high temperatures
(Bardwell and Craig, 1988). The in vivo role of the HtpG protein remains unknown.
However, preliminary results indicated that HtpG facilitates de novo protein folding in
stressed E. coli cells, presumably by expanding the ability of the DnaK-DnaJ-GrpE molecular
chaperone system to interact with newly synthesized polypeptides (Thomas and Baneyx,
2000). Furthermore, HtpG was copurified in E. coli with MccB17 synthetase, an enzyme
involved in the biosynthesis of the peptide antibiotic microcin B17, which inhibits DNA
replication by induction of the SOS repair system, suggesting the requirement of HtpG for
production of the antibiotic (Li et al., 1996). However, when microcin B17 production by the
E. coli strain deleted for HtpG was compared to that of the parental strain, there was no
effect on microcin B17 production in vivo. This result implied that the copurification of HtpG
with the MccB17 synthetase was potentially an artefact, or that another E. coli chaperone
could substitute for HtpG (Milne et al., 1999). To examine the effect of HtpG on the
reconstitution of MccB17 synthetase in vitro, the chaperone was expressed and purified as a
fusion to a hexahistidine (His6) tag. Addition of the His6-HtpG did not stimulate MccB17
synthetase reconstitution or heterocyclisation activity in vitro, suggesting that HtpG mediates
complex assembly or stabilizes protein subunits prior to the hetero-oligomerisation (Milne et
al., 1999). Based on these results, we suggest that the function of AlbXXII is to mediate
complex assembly by facilitating de novo protein folding of PKS and NRPS enzymes (AlbI,
AlbIV, AlbVII and AlbIX) involved in the albicidin backbone biosynthesis.

Characterization of the complete sequence of the XALB1, XALB2 and XALB3 clusters
enables one to characterize all enzymes of the albicidin biosynthesis pathway, including
structural, resistance, secretory and regulatory elements, and to engineer overproduction of
albicidin. For example, one may insert expression-enhancing DNA into the genome of X.
albilineans in a position operable to enhance expression of the Albicidin Biosynthesis Gene
Clusters. One may also modify naturally occurring albicidins to obtain additional
non-naturally occurring antibiotics by adding DNA encoding additional enzymes selected to
produce a modified albicidin-like molecule. This approach will allow (i) the purification of
albicidin and the other structurally related compounds potentially produced by the same
biosynthesis apparatus; (ii) the characterization of the chemical structure of albicidin; (iii) the
investigation of the mode of action of albicidin in the pathogenesis of X. albilineans in sugarcane;
and (iv) the characterization of the bactericidal activity of albicidin. For example, one may
also increase the resistance of plants to damage from X. albilineans infection by inserting one
or more of the resistance genes identified herein into the genome of the plant. One may also
provide materials to prevent damage by albicidin produced by X. albilineans by applying an
agent that blocks expression of the Albicidin Biosynthesis Gene Clusters to the plant to be
protected. One may also use portions of the DNA of the Albicidin Biosynthesis Gene
Clusters to obtain agents useful in blocking expression of albicidin by screening materials
against a modified host cell line that expresses the Albicidin Biosynthesis Gene Clusters and
selecting for materials that stop or decrease albicidin production.
SEQUENCE LISTING

<110> University of Florida 
Royer, Monique

<120> A BIOSYNTHETIC GENE CLUSTER FOR ALBICIDIN, THE ANTIBIOTIC ALBICIDIN,
RESISTANCE GENES, AND USES THEREOF

<130> 79-1 

<160> 54 

<170> PatentIn version 3.1

<210> 1 
<211> 55839 
<212> DNA 

<213> Xanthomonas albilineans 
<400> 1 

gaattctttc gccattgccc gggattgatg actggcatcg ggattgtcgg gaccttctct 60 

ggcttgttgc tcggtctgca tcagttcgcg gatgctatcg cgccgaaggc caccgcgttg 120 

gctactccgg tgcggcagat tggagcgacg ggacatgcac ccgctagcgc ggccagtgcg 180 

ccgatgactg gcggggtgtt cgtctctttt tccgcaacgc cggtgcgatt gcccgcggta 240 

acaggcatga aggttctcaa tgagggccat gacggcatgg cgggcgcggc tatcactcat 300 

gcggcggtaa aaccggtgat ggctccattg acaggggaca tgttgacagg cgcgttgagc 360 

tgcctgctgc acagcgtgtc gggcgctttc ctcgtttcgg ccattgcggt gttgtgcgcg 420 

cttttgacca cgttcgtcga aaagbtacta gtggcgcaat gctttcatca gcttcaggaa 480 

ctgtcgtcga cgattgaccg actgttcgca tttaatcctg gcgacgatcc gabgatgcgt 540 

ttgacgttga cgtcggaatc gacagagcgg ctaatcaagc ggatcgcgct tgcattagac 600 

gacattgcaa tttcgcgaga tcaagttcag tgagatttta tcgatacgga tttcctgtgc 660 




ggcgtgacgC cgagcgtggc caggaagaaa gccccttctg gatttcgctg tcagacatga 720 

tgaccggcat gaccgcgctt tttctggtgc tggcctgcag catgctgcta atgcgcatca 780 
ccgaatcgaa atcgcccagt ggaacgtcgc ccagcgacga gaaagtcgtt gataccgccg 840 
cgccgacgca gcgcaacgag cgttccgtgg gtgcggcagt gaacgcttcg catgatgcga 900 
accgacttgc tacctacgaa tcggatattt ccaccgtgtt gaaaaatata tcggcgttat 960 

cgcagaaata cggattttcg gtcgatgcta ccactaacac gatcgatctc ggccagtcgg 1020 

ggctattccc gcttggaagc gaccgcttga gcgcgacaca ggaaaggttg ctgcgcaatt 1080 

atgtcgccga tctcatcgca ctgactcaga acgatccggc catggcacca ttgaagagca 1140 

tcaccgtggb cggctataca gatccggctg ggtcgtatct gttcaatctc gatctgagtg 1200 

cgcgccgcag cgaacgcctg atgtgcgtgc tacttgccac gctggaaaaa cagagtagca 1260 

cgacaggtgc accgacgatg accgaggact cgctgcagac catccggcac ctgttcaggg 1320 

ttggcggcta ttccgccgac gcacagaagg aaagtgccgc agcaagcagg cgattagcgt 1380 

tgaagctgga tttttacaag atcgacgagc cccggcggca agccgctgtg cttgcgatgc 1440 

cggtcgggtc gtgcgcgctt ggatcgcgtt agggaaggtc tgaataaccc cctccccagt 1500 

aatgcaattt attgatttta taggtgtgaa taagagcgga taacaaaaca tagcgagcga 1560 

ctctcaggcg taccgcgcag cacgcaggaa ccggtgtgta cgagtggtac ataaggaatc 1620 

cgagcacagc acgcgcccgt ctggcggaca cgcagtagtt ttgttagctg ctctaagcgc 1680 

aggatgaggt taatcagagc atcactaaca ggaatgatag tttggaaacg ataacaagta 1740 

ttgcgccgtg cgcgctgccg gatgtctggc gtcatgccgc cgaaggcaag accgtcgcct 1800 

cgttcgactg cttcgatacg ctgctgtggc gtggcgtgag cgcgcogagc caagtttttt 1860 
gtgagttggc cggcaagccg ccgttttcga cgcaaggcta taccgcctac catcgcgtga 1920 





tggccgaatc gttggcgcgt cgtgtccggt ttgaacagaa agaggttcca gaagtcacgc 1980

tgcgcgaggt gtatcagcag ttattcccgg agcctgacga tgcaacccgg ataacggcat 2040 

gcatggcgtg cgaacaggaa gctgaggcat cgatctgctt cgtcgtgcca gctgtgattg 2100 

cctgcatgcg cgaagccagg cgcttgggga tgaccattat cgtcgtcagc gatacctatt 2160 

ttaacgctgg ccagcbgcgt gcactcattig cgtcagtgtc gcccgaagcc gacgagctgg 2220 

tcgaccggat cttctgctct tccgactatc ggaccatgaa gaagtaccag ctatggcacc 2280 

gcgtgctcga cgaactgtac gcagcacctg aaacaatcgt gcatttaggc gacaatcgcg 2340 

tcgccgacat actaatgccg tcgaagctgg gcatcgcgtg cctgtggcta gatcggtatg 2400 

ccggtgcagc catgaccgtg ttgcgccgac gcgaatgcgc gacccgattg atattcagcg 2460 

gcgttgaaga gactcggtcc gtgtggacac tttacgacgg cctcgactgt cgaacgcaag 2520 

ccgagcgttc gaactggcat gatgaactcg gctggcactt tcttggaccg gtggtctttg 2580 

cttttgcgaa aacgctggcc gacgaatttg ccggccaaac acacggaatt gatcagccga 2640

cggttcgctt cgggtttctg atgcgcgatg ctcatctgtt acgcgaagct gcggatatcg 2700 

tcgcacctca tgagccgcat ccgtcattgc acatcagtcg aaagaccgcg ttctctgcgt 2760 

ccttcgatag tgacgatgcc attctgcatt tcgtcaagtt gggcaggctc gaatatcggt 2820 

tgagtgccgc gcaatgtggc gtctgtctac ttctgaacga ggtagagaaa gcgcgtctgg 2880 

cggaagcgtt catggcaaaa gaagacacag caggtgcaat gcgggaattt ttttctccgc 2940
agatgttgga cgctatcaag acgcgttcga aagcgttccg tgccaggctc atcaggcaca 3000 
tcgtcgcgca gacaggactc aagcgcggcg acacattatg tctcgtcgat acaggttata 3060 
atggaacgat tcaatacttg ctgcatgatg tattgcgtaa ggaaatgaac gtcaccgtca 3120 
tcggacggta tctaatctac cggcagaacc tcagactgca cggaagggca tcgggcctca 3180




tcgacgaatc gtggctcgac cacggactga ttcatacgct ctcgcagtcg ggtctgtcgt 3240 

atctagaggc gctatgcgcc ggtgctggcg gctgcgttgt cgactatgcg cagaatgggc 3300 

gctccgtctg caatgcggag atggtctgcc gctcgccatg gatagacgca tgccagcgta 3360 

tggcgctgat gttcgtcaat caggtttgcg cctcatccgc gtccgagctg ccgaagctga 3420 

cgcgtagccg tttacgcgag tccgcattga cgaacatcag cgccatgctg tttttcccca 3480 

gcgagtgcga attgtccgag atcagtcaga tgcaggggga agtgaatctc ggcgccgata 3540 

tctgccacag cctgtgcgat ccagaaaaag gcctgagtgg attgcggcgg cgggggctgc 3600 

tttatatgac tggcggtggc gagtttcgga gcaattggcc ttttgagctg cgttatgcgg 3660 

gcgcagatca gctggccctt tacatggcgt gctttcgcaa cggtttgcgg atgaccgagc 3720 

ctgcgttttc ttaccgccac attacattgc cgctgacatt cgcatttgaa gcagacgtcg 3780 

cgcgtcgatg cgtgcccgca catgcgacat atgabggata ttacactgcc gtttttccac 3840 

ttagccaagg taaggtgtac gttgaaatcg gcgctgtcgc acagtggttg caaatcgaat 3900 

cggtgaccag ctttccagct gaggcgtacg gctatatatc tgagtccagt gggccacaac 3960

tgcgcttcga gtcaaccgat tatatatttg atgacgtgac gcgccatggt aacggactgc 4020 

tggagtgcga tcatgccgcc gtaatgacgt ttgcgccgat tccggaagga acgtcggatt 4080 

acatccggct ggtattccgc cctataaccc ttcgcgacga cccacagcta actgcgtggc 4140 

ccgccgacca gcaactcgtt gccgtcaacg gtgtggatag ccactgcggc ggcaaagaca 4200 

agtcagcgcc gatacccagc tatgcggcgc ttatcgacgc aaaactggct gcgatcggat 4260 

cggttctgaa caccgcggtic tataccggat tgtaatgagc aagggaaaat actttctgag 4320 

gatttaatat ggatggtgca gtatttcacg gctaccacac cggaggtatc tacaatgcag 4380

gcaagtttga ggagcggcgt tatacgacca taggtattga tggatacaat aggaggctcg 4440 
cgggtgagtt ggagtggacc aacaacgttt atccgcgtcc ctatggccct tccgttgacg 4500

attgggtgga tgaacgcgat agggtgccaa aagagcggcg tgtcaaagac gattatcgcc 4560 

atccccggCt ttttgccaaa cccaccacga gcgcggcagt ggaggagtgc aggccggctt 4620 

cgggcgtcgt gcagcaggcg cggcggctgg acaatgcgag cgcgagccct cttttcgctg 4680 

caagcgcggt gcgacgagga atccggtcgt gtgtcccgtc gaatctgcca ccgagtgtcg 4740 

ccgtgggcca taggctagca ggaccattgg catctgagta tattcaaaac tcacgcttgc 4800 

tcgtcaatcg gcctgcaagt acgtttgtaa cgttagggag tcgtgttacg gcagggcatg 4860 

ccctatcgga catcgccagc ggtgcagacg cgaccaaccg cccgcgcaac gtgaaaggca 4920

aacatgaaaa agacgacatg acccaacggg gccaacataa tgcaggcggt cttgtggcga 4980 

gaatcaaggc gaatattcgg aattatatcg atgcaaagtt acgccggtat cctgcaagtc 5040 


ggacttgagc tgataaatga tgaacgcggt gattcctgta attaaaattc ttaccactga 5100 

tgtcactggg ataaatcttt cccaccctcc cactttcgca cagaacggga agactttgaa 5160 

tagactctta gaactcctgc gcctaccact tcaaacggag ttttcgactg cgcgagatgc 5220

taccgcgtct ggctcgcgcc atgatgctgc accaagctca tgcgcagccg gtcttgcgga 5280 

tggccagtaa ttttcatagc cgagaatatt tactaataaa taacataata gcactatctc 5340 


atcgccatca tcagaataga tgatcgattt tatttattaa caaggtaatt atatatgttg 5400 

caatacgttg gagaaatttg tgccccacat aatttaagtt atggaaaaaa tatagattcc 5460 

tcaagaaata gaatagagat tgaaggtaat gtaaaaattc ccaggcctaa atitatgcaat 5520

gaaattgcag ccgttttcag cgcagatcag gccataaatg caaatatgaa aattttgcaa 5580 

aaatgcgatt ttggacaatc tacgtctgat acttttagaa aaaaatcttt ttttaaccat 5640 

gctagtaatc tagtggggaa aacaatacat ccccaagttg cagatcgctt tcgctcggaa 5700 
gtttcctttg atggaacggg taaaataatt ggattcaaaa aggaaatttc aaataataaa 5760

tttgattttc taataaaaaa ttgcattcat tctattaatg agactccgca tttaatagaa 5820 

aataagatta gaacccctgg ggagtgctta ataaatatbg ccatctctct tgacagtgtc 5880 

atgccaccaa tcccgcttaa ggabgaaatt caagcgacta taagatatac ttggcttgat 5940 

ggtggcttgt cgcgtcaagc ccttgtaaat ttaatcgagc titatcaagtt gagaaccagg 6000 

tttaatgcgc ctgaattata tccaaaaata agaaggaaaa ttgacgatat tgctggaact 6060 

ttggagaaag aaaccttcca gacgaaacgg aatgacaatt bctatggtcg cctttttgaa 6120 

acgcatttat cagaatattt aataaaaaat cctgattatg cgatgatgga ggtccgtgat 6180

gcagtggctg catttatttt gcaagacttc atatctccac ttttaacgtg caaagggaat 6240 

gaagaatcgc aatctgcaat ttgtgaaagg ctagcttctc ttatggaaga tgaccctcct 6300 


tcttggcggt gcaatabtga abcagbbaag aaabbbcbbg ccbcbaagag bcaagctgac 6360 

bbtabagaaa bgabgaaaba bcabggcgaa bbbbcbgbgc cabbgabbcb bbcgabagca 6420 

gbgaaababa bbacbabcgc gccaggabbg caagagcbba aaaabaaagc gagbgaabbb 6480

bababbgaaa aaabcabtcc gcaacgcaag cbtcgcaacc bbabattaag baabaatgcb 6540 

cabaabaaaa aatccaaccb bbabggbbbg abgcbbccab abcaacgbgg agcaaabgca 6600 


ggcbabbcca bgagbggcgg gabcggcabc aggccbabbg abcgbbabgc acbcccbggc 6660 

gbggaagabg gbatgcabga ggabgabcbg gbagcgbcab ccaabgagab aaccabbgcg 6720 

acbggagbbt cbggcbcatc aaababtcbb aabbbtcbbb bcaabaaaab bcgaaaaaca 6780

bcaacgaabb tbccbabgga bgccgggaga bbggccgbgg ccbcbbggcb bgcbbabagb 6840 

ggagggcacb cbbbbaabga agcbbabbcb gbabbcbcgb acaaaaacgc agggaaabbb 6900 

aaabcaabcb cbbbbaagag tbbagcbgaa gacbabgabb bgabgaabaa aggbgbggag 6960 
catgcataca atcaagtact acagacagcc aaacgtttac agaaaccggg gtccgggcaa 7020

ctgccccccc atgtcagagt gtcagcctcc cgtaaaattg atccagccat agctagaggt 7080 

ctccgtgcgc actagccacc gaggcgacca acaatgcgca agagcaagtt caccgagagc 7140 

cagatcgtcg ccacgctgaa gcaggtggag ggcggtcgcc aggtcaagga tgtatgccgt 7200 

gagctgggca tttccgaggc gacgtacttg ctcttccact ggtaataggt cgccgtgctg 7260 

atgccgactt ggcgacagat gtctttgact ggaacgcctg cgtcggcctg cttgagcgtg 7320 

gcgatgatct gtgtctcgtt gaacttcgat gtgcgcatgg aacctcctga cttgggaacg 7380 

atgccagaaa gatctactta tgcggtgtct gcggatcggg gjgagcttacg catccagcct 7440 

gggtaagtcg cagcagaaga gaagcctaca tcgagaacag tacgttatgc ctagtggttt 7500 

ggttcggccc ttgtttgaca cgatcagccg ctaggaaccg ccgccatctg ggttcgccgc 7560 

ataagcctac ttttcgatca gtacgtcatc gatgaacatc gcgtccagac cgctggtgaa 7620 

gaagatcgac agcgcatcct ccactgaatg ggcgatcggt ttgcccatga cgttgaagct 7680 

ggtgttaagc accaggggaa tacccgtcag gcggtagaat tctttgatca gggcgtgata 7740

gcgcgggttc caatgttgct tcaccgtctg cagacgtccg gtgccgtcgt ggtgcacgac 7800 

gcccggcacc ttgcgcgtgg cttccgcacg gaacttcagg gtgcgcfccca tgtagggcga 7860 

ttcctggtac agctcgaaat actccgcgcc gtgctcatgc aaaatcgacg gtgcgaacgg 7920 

gcggaactcc tcgcggaact tcacccgcgc attgatgatg tccttgatcg caggcgaacg 7980 

cggatctgca aggatcgagc gattgcctag ggcccgtggc ccgaattccg cgcggccttg 8040 

cacccaggcg acgatcttac cctcggbcag cagccgggcc gcgcgttgtg ctgcgtcgtc 8100 

gaggcaacga gtgaatttgg atagcgcgcc gaagcgctcc acgttatgca aggtctccgc 8160 

actcatgctg ctgcccaggt agggcgattg ttcgcgcgca gccggcggtg tctgctcagg 8220 
atggtcctcg gcgtgtgccc ataatgcggc gcccaccgcg ttaccgtcat cgccaggggc 8280

ggcgaatacg tgcagatgac ggaacggagt ttcagccagc acgcggccgt tagccgagga 8340 

attgagtgca cagccgccgc ccagcaccaa gtggtcggac aagcccaaag cgtgcaggtt 8400 

gtgcaggaat tcgaagagga cgtcgcagaa cacctgcbgg ccggcatagg ccaggttggc 8460 

caattcgatc gttggctggc ccttgcatcg gcgcattgca tacagcgtgc gctgcaactg 8520 

gctgaattgt gctgcgggcg caaacctcag cgttaggccg tcgacgcgta gcatctggcg 8580 

caacaactcg tacagttgcc gatcatgttg cccgtaggcg gccaggccca tcaccttcca 8640 

ttcttcgccg gacagggtgc cgaagccgca aacctcgcag atcataccgt agaagaagcc 8700

caggctggcc caactgctgg tctcgctttg gtggatcggc gtaagcttgc cctgttggta 8760 

gtggtagcag gccaaagcat ttttttcacc catgccgtcc agtactgcgc acaccgcctc 8820 


ctcgaacggg ctggtgtagc agccggccac cgcgtgggtt aggtggtgct cgtaatgacg 8880 

gtagctgggt ggcttgaagg caggbtcggc catgtggctc aagtcatatt cgagcaggtg 8940 

tccggggtgc tccaccatcg ccagctgcga acggtagaag aagctctgtg ccacgaattg 9000

cttgctgacg tgccaaggca ggtcgccgaa ggcgctgcgg tattggtcta ccgcttgcgc 9060 

ggtcttgccc aggccctccc gcatcagttc aggtgtttgc ccgctccaac tagtagcgac 9120 


gaccagttcg gcgccgggat cgccgtattc gtggaccagc ttgatggcgc gctgaaacac 9180 

gtccggggca acgccgattg aacgcttgta ctgcaggtag cgctcggtgg cctcggcaaa 9240 

gcgcacctga ccatcgtcgc cgacgatagc gatggctgaa tcgtggaagg aattggcgag 9300

tccgatgtaa gtgcgcttca tgtccgattc cagtgaaggg gacgatggat gcttccggat 9360 

gcctacggcg atgattgtgg cgcaaattigt gtcagtttga actgcaatcc cagcgtagag 9420 

agcgccacga aagcattggc cgcgatgtag tacaccaccg tagccccgaa ggccbgcttg 9480 
aaagccagag ctacgcctgc cggcccctgc agatgctggt gcaggccgct aaagaaaatt 9540

tccgagacca gggcgatgcc cagcataccg ccgacctgct ggatgacctg cagcgcgccg 9600 

gaacctgcgc cggcatcctt cagaggtacc gtacgcatca ctgtctggaa tagcgaggcg 9660 

atggtgatgc cacagcccag tccgccgatc agcaacggca gggtaagcgt ccagggatcc 9720 

agcgagcctt cactgcgcgt gatgatgacc cacaaggcca gacagctagc gatcatcaga 9780 

caggcgccgc tgaagatttt cgcgcgtagg ctttcgacgt gccgagcgag catagaggca 9840 

atcgccacgc cgacagggaa aggagtagtg gcgacgccgg tttccagtgc cgaatacgcc 9900 

agtccttgct gcagaaagat cacgaacacc aggaaaaaac cctgcagcgc cgaatagaac 9960 

accgacacgg acaaggcgcc caagatgtag tcgcgatggc tcatcaggta gatcggcagc 10020 

agggccgggc gcgccaagtg ggcttgccga cgttgccagg cgacgaaggc caccagcagc 10080 

ggaataccga gcgcaatggc tgcaaagcac catagcggcc agccgtatgc gcgtccttct 10140 

attagtggga acaccaggca caacaaggcg agcgcggcca gggcgatgcc gacccagtcg 10200 

ttatggatgc ccgcatgcgc cggcaccttg ggcacccaga tggcggccgc cagcaaggtc 10260 

acgaggccga tcggcacgtt gatcaggaag atcgcgcgcc agccgacgcc gaacgcatcg 10320 

atgtggatca gcaagccgct gacgaggggg ccggcgaatg aggccaggcc cgcgaccagg 10380

ccgaacaacg agaaggcggc cgcgcgctcc ttcggagcga acatggtttg cgcgatggcc 10440 

atcacctgtg gtgccagcat ggctgcggcc aagccctgca aagcgcgcgc gatgatgagc 10500 

acgtggatat tgccagcgat ggcgcagaac gcggacatca agataaaacc ggccacgccc 10560 

gtgccgaaca tgcgcttgcg gccgagcatg Ccacccaacc gccccaacgg cagcaacccc 10620 

aacgcaaaca gcagaatgta tatcgctacg atccattcca gctgttgctc gtccgcgccc 10680 

aggttcttct ggatactggg cagggcgaca ttgacgatgc ctacgtccag caagttcatg 10740 
aaattggcgc tcagcaacac gatcatcgct ggccagcgcc atcggtagtc gaactgcgct 10800
ccgggtggtg ccatgccggg cggcataccc agcgcttcct tgggtttttg catgttgtgt 10860 
gctccttatc cgcttatggc cgcttcagcc ggtcgcatgg tgacggtgag aaaatgcaag 10920 
atgtcgcgct ccaactcgcg ctggaaggcg ctgcggtcga agccaggcgg atcgaCggcc 10980 
gcctcgccca cgcgagcttt catcgcctcg gggaatacgc tgatgaaggc gtagtggccg 11040 
gcattgggca ccacgcgcgc ttccagtcga ccatcattgc ctagcgccgt gcgtgtcgcc 11100
acaatcgttt cgtgcgccca ttgatccttt tcaccgacga tgagcagcac cggtacctcg 11160 
actttcgcca gggcatcctc gtgcatgtac aggctgaaat ccggcgcaag cgccaccacg 11220 
gcgcgcacgc gcggatcagc tgtgaccggc acggccctga tcggtacccg attttgtcgc 11280 
accagcgcgg tccaggcggg ttgttcggcg tgttccgggc gatgcgcaaa atcgaccatg 11340 
aaaccggtat gcggttcgcc cccggcgatc gctaaggcgg tgbagccgcc gacggagtgg 11400 
ccgatcaccg ctacgttatg ggcctgaatg gcaggaccga actgcgcatg gccggtgagc 11460
gtatcgatca ccgcgcggat gtgccggggg cggtcttcca gattctgata gctgtattcc 11520
agctgatgct ggaacaggtt gtcgcccgga tgctccggca aggcgacgat aaagccgtgc 11580
cgtgctaggt aatgtgccag cgtgcgaaac actaggccgg cgctgcgcgt gccgtgcgag 11640
atcaccgcta gcggaaacgg gccggcttcg atcggcgcgc ccagggccac gtccagcgta 11700
taaggtccca tcgccgtatc ccgtgaaggc gtggcggtgg gatacatcac ccacatcggc 11760
accaccctgc tggcatcacc atcggtttcc agtttttggc aacccacata gctattcatg 11820
cgtaccccaa cgaaagaata aaaaacgcgt gccgcttacg ccggcatcag cttatccaga 11880
cttgcaccgg cgggcgccag ccaaacagcg gtacgacctt cgcccagctc cctgcccatc 11940
aagcctcgca cgccggccag atcctgcggc gttggcagca acgcggccag ccgttggccg 12000
tcggcatcgt gcacggggrtt gccatctaag tcgaaggcaa gccccttgca cgggccgaaa 12060

tcgcggtgga aacgctcgtg cggcagctgt agctcaaaag ccaggccaag gcgcctgagc 12120 

tgttgattcc agcggccgat gatagcgccg acttcggcta tgtattgtcg gcgcattacc 12180 

gcattgattg ccaactcggc aggtgcggtg gacgagacca gtctgtcttc gcagcgcacg 12240 

tcgacggcga cctccgtacc ttcgagcttg tcgaagttgc gtgggctgcg gatgccggcc 12300 

tggtacagga cgcgtgaacg ctccgaaacg tcgtggccga acagatcaaa gatcttcggt 12360 

agccaatagt taaggtattt ctgaacgact ggcaagggga tcgcaccggc gtcgaagatc 12420 

gcgtgcgtat cctcacgcaa ggtaatctcg gcgcttcgat acagcacgcg ctccaggcca 12480 

tccacgccga acttaatatg cagaggttcc tcgaacatca tgaagcgggc ggtgcgtgcc 12540 

aacggcagga aagctgactg ggtcaccgct tgaatctggt acttgcctac ccggtcggcg 12600 

aagaagcacc acatgaaatg cgatagccag tcttcggtgt ggtagttgaa ggcatcgagc 12660 

aggcgcgggt tctgcgcatc gccactcatg cgttgcagca ggccctcggc ggcgtcggct 12720 

ccgtcgctgc caaagtattc gatcagcaga tgcgacatcg cccaggtgtg ccggccttcc 12780 

tctagaaaaa actggaataa gtgctccagg tcgatcgcgc tgggcaccat ttgcgtcagc 12840 

tcgtggctct gctcaaccgc ggcgttttct acgtcacctt gcacggtgac atgatccagc 12900

agcaggtcac gatattcttc cggcacgcac gaccatgcca cctgtccctt gcgctcgccg 12960 

aacactacgg tgttgcggtc tggcggcatc atgaatacgc cccagcggta gtcgctgggg 13020 

cgcatgcggt gatagcgcgt ccactcgctg cccttgacgc caccggtagg catgcgcagg 13080 

ttcatctcgc gatcgtggaa ttcgctgggg ccgcggaggc gccaccattg caggaagcga 13140

acttggtagg aagtcagttg cctgaccagc gcgctgtgtg gatcaaggtc gatattgtta 13200 

ggaatgtaca tgagggtcag gccgcgctgt gcatcgaggt aggggcgact tcgccgaccg 13260 
tgagcccgat gaagtcgatg atccgctgag ccacggcacc cggatcattg aacagttgca 13320
ggtgaccgcc atgatctagc aggtccagac gcaaagacgg gtcgtgccta gccaactgca 13380 
ccgaggcgga gtagtggctg aagctgtcgt ccttgcagtg cacgatcagg gtagggtgcc 13440 
tgcccagggc ggtgggcagc aaggcttgca cggattgctt gttctcttcg taggcacgca 13500 
tgtagcgcga gaacaccaat gtgctcgctg gatcggctag gtgcagcatg gtgagctttt 13560 
cggccaagtc gtcgccgcgt aaaggttiggc cgcggtactt gtccaggatg gcggcgagct 13620 
ttttggcctg ttccagaccg tgccgctcga tctgaaggta gatcggcaag gcgcaacgtt 13680 
cgaattcgga ttttacgatg ggcggcagca ggcccgccgg cgccacccaa gccatgctgc 13740 
gtggtgcgaa gccatgcagc gcaatggcat gcacggccaa ttgtgcagcc tgacaccaac 13800 
cgacaaaatg gcaatcggcg tagtcgtgtt ggtgcaggat gcccagcagg gtcgcggctt 13860 
ggcgatccag atcgaagtct tccgcggtta ccgatgtctg ggcattcggg cagccgatgg 13920 
attcccagca caacacatgg aaatgcctag ccagtcgttg cgccaaccgg ctcagcagca 13980 
ggtaggacat gccatagggc ggtagcagca ccagcttggg cgatgcctga gcgcctagcc 14040 
aatacagctc aagctgccgt ccatcggtag tgcagtattg cgacagcctt acccctgcga 14100 
gcgcgtcgtc cagggcggac agatcttgct tttccagata gtgcggcaag caagcacagc 14160 
Gcatgagttt cctaccctcg cttgaccaca tgacaagact gcgccgtcac cggtggctgg 14220 
aaagcttgcg ccggggcgga acacctcacg agtaactcga ttcgaaccca cttccgtctg 14280 
gagagctcgc gtcccctaaa ttcttgtcat cgagttcgcg cagcgataag gggcgcatgt 14340 
cggtccaggt ttcgtcgata tacgccatgc actcgtcctt accgccagca tagccttcct 14400 
tgcgccagcc aggcgggacc tcaagatcgg acggccacag tgaatactgc agttcgtcgt 14460 
tgatcagcac cagataagcc tgttcctcga acgtcatcct aaagataccc ccggaaggct 14520 
gctgcgaagc acggaagttg ctacatcgca caatgcgatt cagatggacc aagcaaagcg 14580

actatacatg acgtcacttc gaagatgtca agaaaaatag cgcgtgaaga gcacgtaaga 14640 

gtgatigtgtt tcgcaccgct gtacgtccca tcgccatcgc ggcaaagctt acacgaaaaa 14700 

ttcaccaggg catgcgttca atacgcgggt caaagcaata tccttgcgct tgcagagcta 14760

tgttcgtgcg taaagcgcca aggcagtggg gagcaacacc ttgggtttcg gttgaggtgc 14820 

gggtagcaat ttctgcttaa Catccacgcg cggcggtttt tgtcttgccg ggcgtcaact 14880 

gtctcatcga gcagtctggg aggctattct gcgctgcctt atcataaata attacgattc 14940 

gttcacttgg aatctcgccg actacgtagc gcagatcttc ggcgaagatc ccctggtggt 15000 

gcacaacgac gagtactcct ggcacgaact gaaggaccgc gggggatttt cctcgatcat 15060

cgtttcgccc ggtcccggct cggtggttaa tgaagcggat tttcacatct cgctgcaggc 15120 

gctggagcag aacgaatttc cggtgttagg cgtatgcctg ggctttcagg gacttgcgca 15180 

tgtctatggt ggccgcatcc tgcatgcgcc ggtgcccttt catggccgtc gctccaccgt 15240 

catcaacacc ggcgacggtt tgttcgaagg catcccgcag cgtttcgagg cagtgcgcta 15300 

tcactcgttg atggtctgcc agcaatcgct gccgcctgtg ctgaaagtga cggcgcgtac 15360 

cgattgcggt gtggtgatgg gcttgcagca cgtgcaacac ccgaaatggg gagtacagtt 15420 

ccaccccgaa tcgatcctca ccgaacacgg caagcgcatt gtitgctaact ttgccaagcc 15480 

ggctgcgcgc cacagtgcac cgttacttgc cgggtcggag caggccggca aggttttaag 15540 

cgtttgcgcg cccgagatgg tgacaccgcg ggtacgtcgc atgctgagcc ggaagatcaa 15600 

gtgccgttgg caggcggaag atgtctttct ggccttgttc gctgacgaaa agcatcgctt 15660 

ctggctggac agccagctgg tctgcagtcc aatggcgcgc tattcgttca tgggagcggt 15720 

gaacgagagc gaggtagtgc ggcattgcgt gcggccaggg agcatggtgc aggaggcagg 15780 
cgagcggttt cttgctgaga bggatcgggc gttgcaatcg gtgcttactg aggacgtcgc 15840

cgagcggcca ccgttcgcgt ttcgcggcgg ctacgtigggc tiacatgagct acgaaatgaa 15900 

atcggtgttc ggcgcgccgg cttcacatgc caatgccatc cccgatgcgt tgtggatgcg 15960 

cgtggagcgc ttcgdtgcct tcgaccacgc cactgaggag gtatggttgc tggcgctcgc 16020 

cgatacggag gatctgtcgg cattggcttg gctagacgcc atcgagcaac gtatccatgc 16080

cattggtcaa gcggctccgg cttgcatttc gctaggcctg cgcagcatgg aaatcgagct 16140 

caatcatggt cgtcgcggct accttgaggc aatcgagcgt tgcaaacaac gcatcgtcga 16200 

tggcgagtcc tatgaaatct gtcttaccga cctgttctcg ttccaggccg agctggatcc 16260 

attgatgctc tatcgctaca tgcggcgagg gaacccagcg ccgttcgggg cctatctgcg 16320 

taacggtagc gattgtatcc ttagtacttc accagagcgt tttctggaag tggacggcca 16380 

cggcacgatt cagaccaagc caatcaaggg cacctgccgc cgtgccgagg atccccaact 16440 

ggaccgtaac ttggccatgc gcctggccgc ctcggaaaag gaccgagcgg aaaacttgat 16500 

gatcgtcgac ttgatgcgca acgacctaag ccgcgtggcg gtigcccggca gcgticaccgt 16560 

gcccaagctg atggacatcg aaagctacaa gaccgtgcat cagatggtca gcacggtgga 16620

agcgaggctg cgcgccgatt gcagtctagt cgacctgctt aaggcggtgt tccccggcgg 16680 

ctcgatcacc ggcgcaccga agttgcgcag tatggagatt attgatggcc tggagaatgc 16740 

gccgcgtggc gtgtattgcg gcagcatcgg ctacctgggc tacaactgcg tcgccgacct 16800 

aaacattgcg atccgcagtc tttcttatga cgggcaggaa atacgtttcg gcgccggcgg 16860 

cgccatcacc ttcctgtccg acccgcagga tgagttcgac gaagtgttgc tgaaggcgga 16920 

ggcgatcctc aagccgatct ggcattatct acatgcgccg aacactcccc tgcactacga 16980 

gttgcgagag gacaagctgc tgctagccga gcactgcgtt agcgaaatgc cggccaggca 17040 
ggccttcatc gaaccatgag ccgcggcgag agaggtccat gcacaccatt gtgaacggcc 17100
ttcccgcttc gtccatagcg attttcgatc gcggcctaca gtacggcgat ggattgttcg 17160 
aaaccctacg gctaggcatc gcgctgccgg acatccgtca gcaggtctgc gaggccatcg 17220
cgactggcgc cgcgccgcgt gccgtggcga aaaactgatc gtgacgcgtg gcagcaccga 17280
gcgcggctat cgttgtcctc tggcggtggc gcccaactgg gtgctcagcc tgaatgaggc 17340

gcccacgctt acgcgcgaac caggacgggc tgctgatgga tacggccggc cgggtggtcg 17400

agggctgcac cagcaatctg ttcctcgtcg agaacggcca tctggtgacg cccgacctgg 17460

gcgtggccgg cgtcagcggg atcatgcgag gcagggtgat cgaatatggc cggcagcacg 17520

gtctcgcctg cgcggtaaag cacgtctatc cggaccagct agtgcgtgct caggaggtgt 17580

ttctgactaa cgccgtgttc ggcattctgc tggtgcgcag cattgacgct cacagctacc 17640

gcatcgatcc tgttaccctg cgtttgctcg atgccctgtg tcagggcgta tatttcaccg 17700

aacggtcact acatcaggtt tccacccatg ccggccaaga cccttgaaag caaggattac 17760

bgtggagaaa gcttcgtcag cgaagatcgc tccgggcaat cgctggagtc gatccgattc 17820

gaggattgta cgttccgaca atgcaacttc accgaggctg agctcaatcg ctgcaagttc 17880

cgcgaatgcg agttcgtcga ctgcaacctg agcctcatca gcattccgca aaccagcttc 17940

atggaagtgc gcttcgtcga ctgcaagatg ctcggtgtca actggaccag cgcacaatgg 18000

ccatcggtga agatggaggg ggcgctgtcg ttcgagcgct gcatcctcaa cgacagcttg 18060

ttctacggcg tatacctggc cggggtaaaa atggtggagt gccgtatcca cgatgccaac 18120

ttcaccgaag ccgactgcga ggatgcggac ttcacgcaga gcgacctgaa gggcagcacc 18180

ttccacaaca ccaaactgac cggcgccagc ttcatcgatg cggtgaacta ccacattgac 18240

atcttccaca acgatatcaa gcgggctagg ttcagcctgc cggaagcagc ctcgctgctc 18300

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aacagcctgg atatcgagct gtccgattga gcctggccat tcgctaccgt gtgactgatg 18360 
caacctggcg gatgcatccg ccgtcgccgt tgaacacgca gcagaaagac tggctgacac 18420 
gcggtggttc gttgaccgcg cacctgcgcc tgttggggca ggtacaggtg caagtgcaac 18480
gggagcacaa agccatggcc tggctggatg aatatcgggt gctcggactg tcgcgctgcc 18540
tgcttgtatg ggtgcgtgaa gtggtcctgg tggtggacgc caaaccctat gtctatgcgc 18600 
gtagcctgac gccgctgacc gccagttiaca acgcctggca ggcagtgcgt agcatcggca 18660 
gtcgcccgtt agctgatctg ttgttccgtg atcgcagcgt gctacgttcg gcgttggcga 18720 
gtcggcgcat caccgcgcag catccgctgc accggcgcgc acgcaacttc gtggcacagt 18780 
cgcatgcgac gcaagccctg ctggcgcgcc gctcggtatt tacgcggcaa ggcgccccgt 18840 
tgctgatcac cgaatgcatg ctgccagcgt tgtgggcaac gctggaaccg gtggcagctc 18900 
cgcgccaggc gagtctgagt gcggacggcc cttgccggca ttcagcgcag atcgtctcgc 18960 
ctgagtcgat gctggaatag ccttggcatc gaagccctcc gatcaggcat cgaggtccgt 19020 
caccagcatg cgcgccagct cacgcagcgg gtcgccttgt aacatgctgt agtgattgcc 19080

cgccgcgctc ttgatctgcg aaagtggtac gtaacctgtg atatccggca gtacctcgct 19140 
acccccgcgt ggcttggaca tgtcggcata ggacacgtgt accgcagtct gtgcctggta 19200 
cagatgagcg ttaggctgca ggcactgcgg ctcgaaaccg gccaacagcc ccaggtgata 19260 
gcgcgtaacg cgcaattgct cggccagcgg tggccacatg cggctttcca gagtaaacct 19320 
caggtgtgcg agcagtttgt ccagtgatgc ctggtggcgg ctgtagtcga acacgtgctc 19380

gtcgtcaccg tcggcgaaca gcagctgccg ggcttcgtct atttcagcct gctcgaagcc 19440 
gcgcttggcc aacgcggcca acgtattgag cgctgccacg aaggtaagct gccggggctc 19500 
gcgcgcatgt accggtatca ggctgctatc gagcaggccc acgtaatcga cgcgcaggcc 19560 
gcgccgctgc aattgctcag ccacagccag ggctagcacg ccgccrggagg accagcccag 19620
caagcggtag ggcgcaccgg tcgggccagc cagcaacgca tcgcagtagt gcgcggccag 19680 
gtcggacaaa tgcgcgaagc ggcgcaccgg ttcgcattgc aggccataga ccctggcaga 19740 
gtgccctagg gcggcagcca gatcgatgta gcaatgaatc tggccgccga tcgggtggat 19800 
ggcatacacc gccgcgcgtt cggtgcgtag gctaagcggt acgataagac ttaccggcat 19860 
gctgccggct tggctttggc gcatgccgcg ctcgacgaca gcggcaaaat cttccagcac 19920 
aggggattcg aacagcgtgt tgacccgcac ttcgatgtcg aaactctggc ggatacgcga 19980 
gaacaactgg gtggcaagca gcgagtgacc gccgaggttg aagaagttgt cgttgaggct 20040 
cacgcgtaag ggggcggcct gtgcgggggt cagcagttcg ctccacagct tggccagggt 20100 
gatttcgacc tcgctgcgcg gagcgaggta gtcgctgtcg ctgctggcgg cttgcggctc 20160 
gggcaggctc aaggtatcga gcttgccatt gggcaagcgc ggaagcgccg gcagcgactg 20220 
gaagcgcgtt ggcagcatgt aggtaggcaa gcgttcctgc agcagcttgc gcagctcgtc 20280 
gaggttcagg acaccctggc gtggcaccac gtaggccaac agttccggcg tgggcgagcc 20340 
ttgcggccaa ccgatcacgg cggcctcggc gacctgcagg tgggccgcca gggctttctc 20400 
cacctggcgc acgtccacgc ggtagccgcg gaccttgacc tcgtagtcgc gccggccgag 20460 
cagttccagc gtaccgttgt ccagtaggcg ggccatgtcg ccggtcctgt acagacgcga 20520 
gccggggggg ccgtagggat tggcgatgaa gcgcgcggcg gtcaggccgc cctggcgcca 20580 
atagccgtgc gtaatgccga ggctttcgat gtgcacttcg cccatgatgc caggcggcaa 20640 
cggtcgcagt tgttcgtcga gcacatggac cttggtattg gcgatgggcc gtccgaccgg 20700 
gacgaagccg ctgccgctgt gctgctcggc cggatcgcaa taggtcatgt cgttgatttc 20760 
ggtacacccg tagatgtacc aggccgtgca ggcaggcagc agcgtcctga gccgttgcag 20820 
cagttccgcc gggcagggtt cgatggagac gaagagctgg cgcagtcgcg ccagccgceg 20880

cggtgcctca gcaacgtggt cgagcagcgc gttgagctgg gaaggaaagg tatacaggcg 20940 

tgtgatctgc caggtttcca gcgcgcgcac gaaagcgggg atgtcacgca cggtatcctc 21000

gtcgatgaac acctgcggta cgccagcaag taggccggcg agcagttcct tgaccgaaat 21060 

ggcaaaggcg atcgaggtct tttgcgccac ccgctccccg gcctcgaaag gcgcacgtgc 21120 

ccacagcgca tgcagccagt tgaggatttg ccgatggggc accatcaccc ccttgggacg 21180 

accggtggaa ccggaggtat acatcacgta ggccagctgc gccggatgca gggcatgcgg 21240 

cagtggcgta tgcggttgac gagcgatggc ggcatcgtcc aggcgcagcc gcggtacttg 21300 

gatcagttgc ccgtcgatgt ccttgccgca gagcaacagc cgtggctgcg cgtcgtcgag 21360 

gatctgctga atgtaggtgg tggggtaatg cgggtccaac ggcacgtagc agccaccggc 21420 

cttgagcacg ccgagtaggg caatgaggaa atcgggcgag cggccgaacc acagggcgac 21480 

gcgctcctgc gggcgcaggc cgcgctcgat caggcaatgc gccaggcggt tggcgtgttg 21540 

gtccagttgg gcatagctca actgtcggtg ttgatcggca caagccagtt cctcggcgtg 21600 

cagtgccact tgcgcatcga acagatccag cacactgcgc gaggtatcca gagtatgagg 21660 

ggtgaactcg gtgcgagcga ccggcagcga aaaatccgag aggcggcagc gcggttcttc 21720 

cagcatccgc tccagcactc tttggtggbg ggccagcatg cgctgaaccg tagccgccga 21780 

aaacagctcc gcggcgtatt cgacagtgac ttccaggtgg cttccgtcgc cgatgaactg 21840 

caggtccagc tcgttgggtg tggtgcgttc gccaaattcc atctgagcgc tgaggaagat 21900 

ctgggcaaat gcattgacgc cttcggtggc gaaattttgg tgtcggagca tgatcggcac 21960 

gagcgggatc tggctgctgt cacgcggttt cttgagagcg cttaagacat gctcgaacgg 22020 

cagtgcgcga tgcgcgtagg cgtccagcac ttgctggcgc acgtgcbgca gaaaatcctc 22080 



ggcaaaggcg tgactgccca agtttaggcg taccgccagg atattgacga aaaagccgat 22140
cagattctcg gtttccagct gatcgcgtcc ggcgctggta gtacctaagc agagttctcg 22200 
ccggccggtg tactggtgca agacgatcgc caggctcgcc ataagtgtca tgaacaaggt 22260 
gacgcgccgt tcctggctga atgcggcgag acgcgcggcc aaggcgtcgg gataggtcag 22320 
gtgtagtatg ccagcacgcc aagctcgatt agccgggcgt ggaaaatcgt agggcaaggc 22380 
cagcccttct tcgtaaccat gcaaacgctg tttccaataa tccagatcgg cgctgaaatc 22440 
ctgtacgcgc tgccatgtag catagtcggc atattgcagt agcagcggtg gcagtgccgg 22500 
tggcgtctgc tgtagcgcgg ctatatagaa agcacgtagg tcgtgaaaga tcaggttaat 22560 
cgaccagccg tcgcagatga tgtgatgcat gttcatcagg aacacgtggt aatcgtccga 22620 
tacgcgcagc accgatacct tgagcagcgg gccgtgggca agatcgaata cgtgcgcggc 22680 
gtgctcggcg actaggcgtg gcacttctgc gggtgtcgct gtgatgcaag gcactgggac 22740 
ctgcatggcg tcggcgatgt gctggctggg ataatcgccg ccagcgcaag ttgctatgcg 22800 
ggtgcgcaag gtttcatgcc tggccaccag cgcctggatc gcctcgcgca gcgctgacat 22860 
cgagaaatcg gcactgcgta aatggcaggc gaaggcgaca ttgtaactgg tacgttgctc 22920 
gggcatgtgt tcatgcacga accacaggcg ctcctgctga tagctcagcg gaaegggacjc 22980 
atcgcgcacg gcacgggaag agatggtgtt gccgccagtc ggagcctgtt gttgccgcgc 23040 
ttcgttgacc actcgcgcaa aatcttccag cacaggggat tcgaacagcg tgttgacccg 23100 
cacttcgatg tcgaaactct ggcggatacg cgagaacaac tgggtggcaa gcagcgagtg 23160 
accgccgagg ttgaagaagt tgtcgttgag gctcacgcgt aagggggcgg cctgtgcggg 23220 
ggtcagcagt tcgctccaca gcttggccag ggtgatttcg acctcgctgc gcggagcgag 23280 
gtagtcgctg tcgctgctgg cggcttgcgg ctcgggcagg ctcaaggtat cgagcttgcc 23340 




attgggcaag cgcggaagcg ccggcagcga ctggaagcgc gttggcagca tgtaggtagg 23400

caagcgttcc tgcagcagct tgcgcagctc gtcgaggttc aggacaccct ggcgtggcac 23460

cacgtaggcc aacagttccg gcgtgggcga gccttgcggc caaccgatca cggcggcctc 23520 

ggcgacctgc aggtgggccg ccagggcttt ctccacctgg cgcacgtcca cgcggtagcc 23580

gcggaccttg acctcgtagt cgcgccggcc gagcagttcc agcgtaccgt tgtccagtag 23640 

gcgggccatg tcgccggtcc tgtacagacg cgagccgggg gggccgtagg gattggcgat 23700

gaagcgcgcg gcggtcaggc cgccctggcg ccaatagccg tgcgtaatgc cgaggctttc 23760 

gatgtgcact tcgcccatga tgccaggcgg caacggtcgc agttgttcgt cgagcacatg 23820 

gaccttggta ttggcgatgg gccgtccgac cgggacgaag ccgctgccgc tgtgctgctc 23880 

ggccggatcg caataggtca tgtcgttgat ttcggtacac ccgtagatgt accaggccgt 23940 

gcaggcaggc agcagcgtcc tgagccgttg cagcagttcc gccgggcagg gttcgatgga 24000 

gacgaagagc tggcgcagtc gcgccagccg ctgcggtgtc tcagcaacgt ggtcgagcag 24060 

cgcgttgagc tgggaaggaa aggtatacag gcgtgtgatc tgccaggttt ccagcgcgcg 24120 

cacgaaagcg gggatgtcac gcacggtatc ctcgtcgatg aacacctgcg gtacgccagc 24180 

aagtaggccg gcgagcagtt ccttgaccga gatggcaaag gcgatcgagg tcttttgcgc 24240 

cacccgctcc ccggcctcga aaggcgcacg tgcccacagc gcatgcagcc agttgaggat 24300 

ttgccgatgg ggcaccatca cccccttggg acgaccggtg gaaccggagg tatacatcac 24360 

gtaggccagc tgcgccggat gcagggcatg cggcagtggc gtatgcggtt gacgagcgat 24420 

ggcggcatcg tccaggcgca gccgcggtac ttggatcagt tgcccgtcga tgtccttgcc 24480 

gcagagcaac agccgtggct gcgcgtcgtc gaggatctgc tgaatgtagg tggtggggta 24540 

atgcgggtcc aacggcacgt agcagccacc ggccttgagc acgccgagta gggcaatgag 24600 
gaaatcgggc gagcggccga accacagggc gacgcgctcc tgcgggcgca ggccgcgctc 24660

gatcaggcaa tgcgccaggc ggttggcgtg ttggtccagt tgggcatagc tcaactgtcg 24720 

gtgttgatcg gcacaagcca gttcctcggc gtgcagtgcc acttgcgcat cgaacagatc 24780 

cagcacactg cgtgaccaat ccaaggcgag cgcggtatcc ggactggccg cggtcagcgc 24840 

gacgtcttcg gcatccagta gcgacatgct tgatagtttc atggggcacc gggtcggcag 24900 

ggttaaacgt gcagcttgag catggcttcg ccatcgctgc gcgcagcgat gtcaggggac 24960 

gattgttcgg gcgaataagg ttcggccatg cacacgagga tcttgcggct gccctcgtag 25020 

ggctcgcgtc cgtgcgaaac cagcatattg tcgatcagca gcacgtcgtc ccgatgccag 25080 

tcaaaatgga tcttgtgctg ggcgaaaact gtgcgcacat ggtcgagcat ggcggggtcg 25140 

atcggcgtgc catcgccgaa ataggcgttg cgcggcagtc cctgctcgcc gaagaacgac 25200 

agcatcatct tctgcgcagc tgcctccagc gcagtgtaat gaaacaggtg tgcttggttg 25260 

aaccaaactt catcgccggt cgccggatgg cacgcaaagg cccggcagat ctggctggtg 25320 

cgcaggccgt cgccggtcca ttcgcattgc atgtcgttgc gggcgcaata agcttctact 25380 

tcctgcttgt tgcgggtgtt aaacacgtcc tcccatggca ggtcgacccc tgcacggtag 25440 

ttcctgacgt agcgcacctg tttgcgcgca aagatttcgc gcacttgcgg atcgatagcg 25500 

gctgtgacct tgagcatgtc agccaacggc gtgcagccgc cctcgctggc cggctgcacg 25560 

caatggaaca gcagtttcat cggccagacg cgctggtagg cgttctcgca atgttgcgct 25620 

atggacagct gcctgggata ttcggtggcc gtgtagacat gctggccgac gtcggtgcgc 25680

ggtgtggaac gataggtata ggccagtcgc tcatcgaaga aacagcgtga gatctgctcc 25740 

aagccaccag ggtgcgcaaa gccacggaac agtaacgccc tgtgttgcca tagcagggta 25800 

ggccacgtcg cgcggtgcgt ggcattccaa tcagtcagcg tggcctcggc cgagtcggcc 25860 
ttgatggtca ggggaagatc agcgtttgtg tgcatggggc acctaggttc gagcatggcc 25920

gacgaagaac aggtgggatt gcaggtgctt ggcgctagcc agtgtctcca acagcgcagg 25980 

ccggattacc ttgccgctgc aggtacgtgg gatcgtgctc acttcgacga atagatgggg 26040 

atagtgatgc ttgcctagcg cgttcttgca caaggcgcgt aaggcggccc atagcgcccc 26100 

tgtatcgatg ctggcatcga caggcaccac gaaggcggcg gggcggggca agccgaactc 26160 

gtcttcgatc aggcagatcg cgcactcctt cacgcaggcg tgcgtctgga tgacgctctc 26220 

cagcgtctca ggcgaaagcc aacaaccgtt gatcttgatg gcggagccca ttctgcccaa 26280 

gttatggaag cgccccttgg cgbcggcaaa gaacaggtcg cgtgtatcga accagccgtc 26340 

gacgaacaat tgggcactga gtatggggtc gccaacatag cccctcgtca gcgtactgcc 26400 

cctcacccac aggctgccga cttcgcctat gcggcagatc tctccctgct tgttcaccag 26460 

cttcacaaca aagcctggta ccggcgtgcc agtgcaaccc atgagcgcgt ggcctggccg 26520 

attggagatg aaggtggaca gtacctcggt gcagccgata ccgtcgagca cttcgacctg 26580 



ccaacgcgtg ctgatcgcat gaccaagcct cgccggcaag ctttcgccgg ccgatatgca 26640 

caggcgaagc gccggccaca ccgcatccgg cgcggcctcg gcaagcaaca gcttgaacac 26700 

ggcgggcacg gcgagcaata cagtgacgtg gtaagtgtgg atggtttgcg cgatctgcct 26760 

gacgctaagc ggcgcggcaa tcacatggct gacaccagcg agcagcgaca gcatcaggtt 26820 

gttcaggccg taggcgaaaa acaaccgcga cggtgtatac atcacgtcat cgctgcgcag 26880 

cccgagcacg gcctgctggt agttgagatg gcagtgcata aaatcggcat gcgagtgcgt 26940 

taccgccttg ggcgtaccag tggagccgga cgtgcatatc atcaccgcgg gtgcatcggc 27000 

ggagcagggc gcaaccacca gttcgtcgtt ttcgatcacc ggcatcaggc tcgtcagctc 27060

taaggtcggc aggtggcgca acgcggcatg atgggaaggc ggcagttcgg catcgatcag 27120 
cacgaggcga ggcttgatgg tcttgagcgt ggtctcaaag tggacgagcg acacaagctc 27180
gtttatcggc gcaaagacca agccaccggc caagcaggcc agcatcaggg caacgcccgc 27240
taggctgtcg atagcaatca gcgccaccgc atcaccgctt tgcaggccaa gcagactcaa 27300

gtgccgggca taggtcgccg cgcgagagcg caactggcga tagctgaagg cctgctggcg 27360

caacggatcg atcatttgcg ccgtcgaagc cagatgcgct gcggaaaaaa tttgcgcgca 27420

cacgttgacc cggaccgacg ggaggaagcc gataggcgca caggcgaaca ccgccgtgct 27480

tgcgtccgac caggacggca tcggcccatc ggtcgagcgt gcgaaccagc tcgcgggcaa 27540

atgactggca acctgacata gtttgccgtg gtcgcaggcc agcagactgg tatccacctc 27600

gatcaggtct tcgacgcagg aaagcgcagg caaagagatc tgcgccgcgc tgccgcatac 27660

ggcactatcg cgcaagtccg gcaggttcct ttggcggtgg tccgcatgcc atagcagcag 27720

gccatcgctg cgtcgcgtgg ccaaggcttc cagtgccatg cccagggcgc cttgcaaccg 27780

gtccagttgc ttgggttcct cgatacggcc caaatccagt gccagggtgt tggcgccggg 27840

gggcgattcg gacagcacga tttggtgccg atgcttgagg taatcgcaaa tcaggccggc 27900

cagcttgcct aaccgtgcat attcccgtag caggctaccg aagctgccac aggggtaagg 27960

tgcggcatag tcaatggtta tgtgctggcc gatcggcgtg tcgctgacat cgatacgcaa 28020

gccaggataa tcgcgccgcc attgatgcag cagcgtggtc cattgacgag cataggcctc 28080

gttgcgccct ggccgcgtct gacctaccga ccaacgcaaa tctagttcgg tgccagagtg 28140

catcggaaga tttgtcagtg ggctatccat aagcgttctc gggtaaggcg atcgacgcat 28200

cgagctcggc tgtgtgcact ggtttggccg gtacgccagc gcgagagact gttcgcaagc 28260

ttcagcgttt cagcgcgtgg accagataac tttgcggcac gccgtgggtg ccgcgcggtg 28320

cgaccggaaa cggccaacgc cccatttgct ggccagcggc ggcgatggtc aacacccctg 28380

gctcccagcc cagtggttgg atcaggcttt ccggttcgtc agtgccaaac tggcgagcca 28440

tcgcgtgcag cactcgggca ttgggcgagt tgagcataga gaggccgata acatcgaaca 28500 

acacgctgct gcccctggca ctcaatgcat cgatgcgcgc gaacagcagc atcactgcct 28560 

cggcgctcaa gtagcacagc aagccctcga ccagccacaa ggtggcggcg ctgccgacga 28620 

atccactctc cttaagtgcc tggggccagt cttcgcgcaa atcgatcggt agcgcaatgc 28680 

gctggcaaac gggctgggcg tcatggagtt tttcgtgctt gtcggagagg acatccatgt 28740 

ggtcgatctc gtagacccgg gtatcggacg gccaggggag acgataagcg cgtgcatcca 28800 

taccggcggc caggatcacc acctggccaa tgccttcact aaccgcctgc atgatcttgt 28860

cgtcgagcca acgcgtccgt acctcgatcg ccggaggcat cggtacgttc tggttgttgc 28920 

gtctgagctc ttcaacgaat tcatcgccgg ccagacgccg tgcgaaaggg tcatggaaca 28980 

gcgcctgctc ccgctcgctt tccagcgccc gcatgcctgc cacccataaa gcggttctct 29040 

cgatatctct catgcatacg ctccggttcg tggtcggctt gcgccgatgc atcatagata 29100 

tgcatgactc gattcgcggc accgtgcctt gatggtggct gcgaagcgaa aacaataacc 29160

aaagggtggt gctcgacggc tttactgtag cgacaccttg tccatcgcct tacggatggt 29220

ctgatccacg caagcgaaag atgagataaa ccacatcagc tgtcaacgcc gatttaaatt 29280

tgacccactt tcctttgaat cgtcgaagta aatctgaccc acccggggtc ttpcatcgtc 29340 

gggctgctag gctgcgcagg gcaaagcccg tcgcagccca gcagccctgc gccggctcac 29400 

gcccgaaggg caggtagccg atctcgtcga ccaccagcag cttcggccct agtaccgcgc 29460

gattgaagta gtccttcagc cggttctgcg ccttgaccgc tgccagttgc atcatcaggt 29520 

cggccgcggk gatgaaacgt gccttgtgcc ccgccatcac cgcacgctgg cacagcgcca 29580 

gggcgatgtg ggtcttgccg acaccgctgg ggccaagcat caccacgttc tcggcgcgct 29640 
cgacgaaggt caggtggccg agctcgacga tctgcgcctt cgaggcgccg ccggcctggg 29700

cccagtcgaa ctgctccagc gtcttgatgg acggcatcct ggcaagtcgc gtcagcaccg 29760 

tgcgcttgcg ctcttcacgc gcgagctgtt cgcttgccag caccttctcc aggaagtagc 29820 

tggcatcctc gcacgcggcg gcctgtgcga gtgcttgcca gtccgagctc aggcgtgcca 29880 

gcttcaactg ctcgcacagc gcggcgatgc gcgcacactg caggtccatc acgccacctc 29940 

cagcagggtg tcatacacgg ccagcggatg ctgcaggttt tccactggca gggccactgg 30000 

ctgtcgtaag ggaagcggtg ccttgagcgc cggtgcggac agtataacga cacgttcctt 30060 

ggccaagcgc actgtcggca cggccttgct gatgccgccc atgtagccgc gcgcctggat 30120

ctcgcgtagt agcaccacgc tggccgggat ccatcgaggg cgcgcttgcc caatgcgctc 30180 

atgcagataa ctcttgtagc cgtccagttt gcaggcgtat tgttaggcgt caccgctcgc 30240 

gcggcgatcc ccaataaggc gggtatgaga cgcgcatggc cgcccttccc gcaggcgtgc 30300 

tgtcgctcta ttgcttacct catgcagaga tcgccaatgt cgccgttaca gcaaacgctg 30360 

ctaacccgcc tcgccagtgc ggccgcctcc cggacaatga tcgagtttcc gcgtccggag 30420

cacgcatcgc cacaatgttg cgacgatgcc gagcttgcgc gactgatcgt gcagttgtcg 30480 

gcgggactgc aaccgctggc gatgccgggt acctacgtga tcattgccgc gccacatggt 30540 

ggcttgttcg cggcagccct gcttgcctgt ttgcatgcca acctggtggc ggtgccgttt 30600 

ccactggatg ttgctcagcc aaatgagcgg gaacaggcca ggctggagac gatccacgca 30660 

caattgatgg agcatggcaa tgtagcggtt ctgcttgacg atgtcgccga tcgcagtgcc 30720

ttcgcgcgca tggcgcatgc tgcgggcacc ttcctggcga ccttcgccga tctaaagcgc 30780 

gaatcgacca gcgcctcctt gtgcccggcg tcgccttcgg acgccgcctt gccgttgttt 30840 

acctctggtt cctcgggtga gtccaagggc atcctgctta gccaccgcaa cctgcatcat 30900 
cagatccagg ctggcatccg gcagtggagc ttggacgagc atagccatgt ggtgacctgg 30960
ctttctcccg cgcacaactt cggcctgcat ttcggcttgc tggcaccctg gttcagtggc 31020
gcgacggtca gtttcatcca tccgcacagt tatatgaaac gacccggctt ctggctggag 31080
acggttgcgg ctagagacgc cacgcacatg gccgcgccga acttcgcgtt cgactactgc 31140
tgcgactggg tgatggtcga gcagcttccg ccgtctgcgt tgtctacgct tacgcatatc 31200
gtgtgtggcg gcgagccggt gcgcgcctcg accatgcagc gcttcttcga gaaattcgcc 31260
ggactcggtg cgcgtacgca gactttcatg ccgcacttcg gcttgtctga aaccggtgcg 31320
ctgagtacct tggacgaggc gccccaacag cgcgtcttgg aactagatgc cgacgccttg 31380
aacaaacgca agcgcgtggc ggcaggggcg agccaggcgc gtgtgacagt gctcaattgc 31440
ggcgccgtcg accaagatgt ggagttgcgt atcgtctgtc ctgaaggcga gacgttgtgc 31500
agaccagatg agatcggcga aatatgggta aagtcgcctg cgatcgcccg tggctacctg 31560
tttgcgaagc ccgccgatca gcgacagttc aactgcagca tccgtcatac cgacgatagc 31620
ggttactttc gtaccggcga cctgggtttc attgccgatg gctgtctgta tgtcaccgga 31680
agggtaaagg aggtgctgat catacgcggt aagaatcatt accccgcaca tatcgaagcc 31740
tcgatcgccg ctaccgcatc gcctggcgcg ctgatgccgg tggtgttcag catcgagcgg 31800
caggacgagg agcgcgtagc tgcggtgatc gccgtcaatc acccgtggac gccggcagca 31860
tgcgccgcgc aggcacacaa gatccggcaa caggtagccg accagcatgg agtcgccctg 31920
gcggagctag cctttgccga acaccggcac gtgttcggca cctatccggg caaactgaag 31980
cggcgcctag tcaaggaagc ctatgtcaac ggccagctgc cgttgttatg gcatgagggt 32040
aagaaccggg acgtaccagc ggccgccgcg gacgatcggc aggcgcaaca cgtggcggac 32100
ctgtgtcgga aggtcttttt gccggtgttg ggtgtcgcgc cgccgcatgc ccaatggccg 32160
ctgtgcgaac tggcgctgga ttcgctccaa tgcgtgcgtc ttgccggtgc catcgaagag 32220

tgctacggcg tgcctttcga acccacgttg ctattcaagc ttgagacggt cggggcaatc 32280 

gccgaatatg tcctggcgca cggacgtcag gcgcccacgc cgacgcgtgc gccggtggca 32340 

agcacaacat gctcagagga accgatcgcc attgtggcga tgcactgtga ggtgcccgga 32400 

gcgggcgaga acactgaagc attgtggtcg ttcctgcgga gcgacgtcaa cgcgatccgg 32460 

ccgatcgaat caacgcgccc ggacttatgg gcagcgatgc gcgcctatcc cggcctcgcg 32520 

ggcgaacagc tgccgcgcta tgcgggtttc ctcgacgacg ttgatgcttt cgatgctgcg 32580

tttttcggta tctcgcgtcg cgaggccgaa tgcatggacc cgcagcagcg caaagtgctg 32640 

gagatggtgt ggaagctgat cgagcaagcc ggtcacgatc cgctgtcctg gggcggccag 32700 

ccggtcggcc tgttcgtggg tgcgcatacg tccgactatg gcgagctgct ggcgagccag 32760 

ccgcaactga tggcccaatg tggcgcttac atcgattcgg gttcgcattt gaccatgatt 32820 

ccgaaccggg cttqgcgctg gttcaatttc accggcccca gcgaagtaat caacagcgct 32880 

tgctccagct cgctggtggc gctgcatcgg gcggttcaat cgctgcgcca aggcgaaagc 32940 

agtgtcgccc tggtactcgg cgtgaacctt atcctggctc ccaaggtgct gttagccagt 33000 

gcaagcgcgg gcatgctttc gcccgatggc cgctgcaaga cgcttgacgc cgccgccgat 33060 

ggcttcgtgc gttcggaagg gatcgcaggg gtgatattga agccactggc gcaggcgctg 33120 

gccgatggtg acagggtcta cggtctagtc cgcggcgtgg cggtcaacca tggcggccgt 33180 

tccaattcct tgcgtgctcc caacgtcaac gcgcagcggc aactgctgat ccggacttac 33240 

caggaagccg gtgtcgagcc ggccagcgtc ggttatgttg aaccacacgg cactggtacc 33300

agcctgggtg atccgatcga aatccaggcg ctgaaggaag ctttcattgc gttgggggca 33360 

caggccgccc cgtcaaactg cggcatcggt tcggtgaagt ccgcgctggg ccatctagaa 33420 
gccgctgcag gcctgaccgg cctgatcaag gtgctgctga tgctcaagca cggcgagcag 33480

gccggcacgc gccatttcag cacgctcaat ccgctgatcg atttgcgagg tacgtcattc 33540 

gaagtggtgg cgcagcatcg cgcatggccg tcgcaggtcg gcattcacgg cacactcttg 33600

ccgcgtcgcg cgggtatcag ctcattcggc ttcggcggcg ccaatgcgca tgcgatcgtg 33660 

gaagagcatg tcattgccac gcccccctcg acgagctccg ctggcggccc ggtaggtatc 33720 

gtgttgtcag ccggtagtga agctgtcttg cggcaacaag tgctggcctt gtcagcctgg 33780 

ctaaggcagc aatcgccgac acccgcgcaa atgatcgatg tcgcctacac cttacaggta 33840 

ggacgcgcag ccctgtcgca caggttggct tttagcgcga cggacgccga gcaggcattg 33900

gcgaggcttg agggtcgtct ggcgggcgtg atggatgccg aggtccatca cggtgtcgtg 33960 

gatgctgccg caacggctcc cgaacatggg cggcagacgc gcgaaggtct tgccggtttg 34020 

ctgcgagcct ggactcaggg cgtgcgcgtc gattggtcgg cgctgtacgg catacagcga 34080 

ccgcagcgcg ttagcctgcc tgtctacccc ttcgctaggg aacgctattg gctgcccggc 34140 

caggctatgc atgccgctgc ggacgctcat ccgatgctgc agctgttgca tgccaatgcc 34200

aaactacatc gctacgcctt gcgtaggtcc ggctgcgcaa gctttcttgt tgatcattgc 34260 

gtggatggtc gacaggtact accggcagcc gtgcaactgg aattggtgcg cgccgtggcg 34320 

cagcgggtca tggcgcagga tgagggttgt atcgaactgg cgcaggtcgc ctttttgcat 34380 

cccctcatga tggaggagac tgagctggag gtcgaaatcg aactgtcgaa gagcgatcaa 34440 

gatgagttcg atttccaact tcacgatgct caccgccaac aggtctttag ccaggggcac 34500

gtacgtcgcc gggtctatac ggcgacaccg cgcttggatt tagcccagct gcaaaagctt 34560 

tgtgccgagc gcgtgttgtc cggcgaagac tgbtatgcgc acttcaccgc ctgcggattg 34620 

cagctcggcg accggctcaa atccgtgcaa tcgatcggct gcggacgcaa tggcgagggc 34680 
gagccgatcg cattgggtgt cctgcgcctg ccaccatcaa gcgttgaaga cagccatgtg 34740
ctgcctccta gcctgcttga tggtgccttg cagtgtagcc ttggcttgca gcgtgatgtc 34800 
gagcacatcg ccatgccata cacgctggag cggatgacgg tgcatgcgcc gattcctccc 34860 
gaggcctggg tgctgctgcg tcacggccat gcagccagac agtccctgga catcgatctc 34920
ctggattccg aaggtagggt ctgcgtcagc ctcggcaatt acaccggccg tgcaccgaaa 34980 
gccgtttccg ccgtcagggc gcttgtcttg gcaccggtct ggcaagcgtt gaccgaaacg 35040
gcgccggcat ggcccgatcc ggccgaacgc atcgttacgg taggagacga tgcatggcgt 35100 
agtcacttcg gtttcgacga gccggccttg tccctggagg acagcgtcga agtcatcgcg 35160
acgcgactgg gccagagcgg caagttcgat catctagtct ggatcgtgcc gatagccgag 35220
agtgaaaccg atattgcagc gcaaggttca gcggcgatcg ccggtttccg gttggtcaag 35280
gcgttgcttg cgttgggcta tgcgcatcgc ccgctgggtc tcaccgtgct gactcgccaa 35340 
gcccttacgc ggcagccgtc gcacgcggca gtgcacgggc tgatcgggac gctggccaag 35400 
gaatactgca actggaaaat ccgtctgctc gacctgccga gcgtaaaatc ttggccgcaa 35460
tgggagcaat tgcggtcgtt gccttggcat gcgcagggcg aagccctgat cggccgtggg 35520 
acttgttggt atcggcggca gttgtgtgaa gtgctgccgc tgccgtcgtt ggaaccgccg 35580 
ccgtaccgcg taggcggtgt ctacgtcgtg atcggcggcg ctggcggctt gggtgaagta 35640
ttgagcgaac acttgatccg cacgtacgac gcgcagctga tctggatcgg gcggcgcgtg 35700
ctggacgaag gcattgcgcg caagcagacc cggcttgcgt cgctgggccg cgcaccgcat 35760 
tacatctccg cggacgcgag tgacccggct gccctgcagg cggcacataa tgagatcgtt 35820
gcgctgcatg gccagcccca tgggctcatc ctaagcaaca tcgtgctgaa ggatgccagt 35880 
ctggctcgta tggaggaagc cgatttccgt gacgtgctgg ccgcgaaact cgacgtcagc 35940
gtgtgtgcgg cacaggtgtt cggcacggcc ccccttgatt tcgtgctgtt tttttcttcc 36000
atccagagca ctaccaaggc ggccgggcaa ggtaactacg ccgccggctg ctgctatgtc 36060 
gacgctttcg gcgagctatg ggcgcgccgg ggtttgaggg taaagaccat caactggggc 36120 
tactggggca gcgtgggcgt cgtagcgggc gaggactatc gccggcgcat ggcgcaaaaa 36180 
cacatggctt cgattgaggg tgccgaagcg atgcaggtgt tgtcgcagtt gttgtgtgcg 36240 
ccgttgcaac ggcttgccta cgtcaagatc gacgatgcta acgcaatgcg cgctctgggc 36300 
gtagtagagg acgagagcgt gcaaatccct gtgcacgcac cggccgagcc tcccagaggg 36360 
cagcctggtc ccgtggtcga gttgtcggtg aatctggatg cccggcgcga acgggaaact 36420 
ttgctggcgg cctggctgct tgagttgatc gagcaactcg gtggttttcc gccggcaagt 36480 
ttcgacatcg ctacgcttgc gcaacgcctg cacatcgtac ccgcctatcg aagctggctg 36540 
gaacacagcg tgcggatgct cggcgtgtat ggttacctca gagcgacggg ggaaagccga 36600 
ttcgagctgg ccgacaagcc gcccgatgat gccaggggtg cctggaacgc gcatgtgcac 36660 
gaggccagcg tcgaagccgg tgaagaggca cagcggcgtc tgctcgatcg ctgcatgcgg 36720 
gcgttgccgg cggtccttcg aggcgaacgc aaggccaccg aattgctgtt tccggaaggt 36780
tcgatggcgt gggtcgaggg tatctaccag aacaacccgc ttgccgatta cttcaacgca 36840 
caactagtca cgcgactgat tgcctacttg agacgacgac tagagtcgac gcctacggcg 36900 
cgcctgaagc tgtgcgagat cggcgccggc agcggtggta ctactgcaag cgtgctacaa 36960 
cagttgcagg catatggtga gcatattgag gaatatctct ataccgacct gtcgcctgtc 37020 
ttcctgcatc atgcggaaaa acactatcag ccacgagcgc cttatttgag gaccgcctgt 37080 
ttcgacgtag cgcgcgcgcc gacggcgcag gccctggaat ctggcggcta cgacgtggtg 37140 
attgccgcca acgtactgca tgctacgcgc gatatcgcca agaccttgcg caatgcgaag 37200 
gcactcctca aacctggcgg tctgctcttg ctcaacgaag tgatcgagcg cagcctcgtc 37260
ttgcacctga ctttcggtct gctggagagc tggtggttgc cccaggacaa gatcttgcgc 37320 
cttgccggct cgccgttgct ggcttgcgcc acctggcgca gcctgctgga ggctgagggt 37380 
tttgcggggc tgagcgtgca cagggcgcaa cccgatgccg ggcaggccat catctgtgcc 37440 
tacagcgatg ggatagtgcg gcaagccagt acgatcgagg ttgcgcggaa tgaaaaagta 37500
accgttccgt cgcagccggc ggaagccggg gaatcgccgc tggatctggt caaaaaactg 37560 
cttggacgca ttctgaaaat ggatccggcc acactcgata ccagccaccc gctggagtac 37620 
tacggtgtcg attcgatcgt ggcgatcgaa ctggctatgg cactgcgcga gacattcccg 37680 
ggttttgaag tcagcgagct gtttgaaacg caatccatcg ataccttgtt gggctctctt 37740 
gagcaggctc ctctccttgc taccctcaca gctccgccgc aacaagacat gctgcagcag 37800 
ctgaaacaac tgctggcgcg tacgctgaag ctggacatta cgcagatcga cacgagcaag 37860 
acgctggaga gctatggtgt cgactccatc gtcatcatcg aattagccaa cgccttgcgt 37920 
gagcgctatc cgagcttgga cgcgtcacag ctgatggaaa ccttatcgat cgaccggctg 37980 
gttgcccaat ggcaggcaac ggagcccgcc gtaccggcag agccaacagc ggaaccgccg 38040 
gtagccgacg aagacgccgc tgccatcatc ggactggccg gccgctttcc aggcgcggac 38100 
acgttggagg agttctggaa caacctgcgc aacggccaaa gcagtatggg agaggtgcca 38160 
ggcgagcgct gggatcacca gcactacttc gacagtgaac gccaggcacc gggcaagacg 38220 
tatagccgct ggggtgcgtt tctgagggac atagacggct tcgatgcagc cttctttgaa 38280 
tggcccgaca gcgtcgcgct ggaatcggat ccgcaagcgc ggatatttct agagcaggcc 38340 
tatgccggga tcgaagatgc cggctacacg cctggctcgc tcagcaagag ccaacgcgta 38400 
ggtgtattcg taggtgtgat gaatggttac tacagcggcg gagcgcgctt ctggcaaatc 38460 



gccaaccgcg tgtcgtaeca gttcgatttt cgcgggccaa gcctggcggt ggataccgcc 38520

tgttcggctt cgctcaccgc gatccacctg gcgctggaaa gcctgcgcag cggcagttgc 38580 

gaggtcgcac tggccggtgg cgtgaatctg ctggtcgatc cgcagcaata tcttaatttg 38640 

gctggcgccg cgatgctctc cgccggcgcc agctgtcggc cgttcggcga ggccgcggac 38700 

ggtttcgtgg ccggcgaagc ctgcggcgtg gtgctgctca agccgctcaa gcaagcgagg 38760 

gccgatggcg atgtgatcca tgccgtaatc aggggcagca tgatcaatgc cggtgggcac 38820 

accagcgcgt tctcctcgcc taaccctgcc gcccaggccg aagtcgtgcg gcaggccttg 38880 

cagcgcgcgg gcgtggcgcc cgattcgatc agctacatcg aggcgcatgg caccggcacc 38940

gtactaggcg atgcagtgga gttgggtgct ttgaataaag tgttcgacaa gcgcgcggcg 39000

ccatgcccga tcggctcgct gaaggcgaac atcggccatg ccgaaagcgc cgcgggcatc 39060 

gccggcctgg ccaagctggt attgcagttc aggcatggcg agttggtgcc tagtctgaat 39120 

gcgtttccct tgaatcccta tattgagttc ggtcgcctcc aggtacaaca gcagccggca 39180 

ccgtggccgc gccgtggcgc ccagccgcgg cgcgccgggt tatctgcctt cggtgctggc 39240

ggatcgaatg cgcacctagt ggtagaggaa gctccggcta tggctcccgg ggtctcgatc 39300 

agcgccagct ctccagcctt gatcgtgctt tcggcgcgaa cgctgcctgc cttgcaacag 39360 

cgtgctcgcg atctgctcgt ctggatgcaa gcgcggcagg tggatgacgt catgctggcc 39420 

gacgttgctt atacgctgca cttgggccgc gtcgcgatgg agcaacgcct ggcttttacc 39480 

gctggctcgg ctgccgagtt gagcgagaaa ttacaggctt acctgggcca tgcgattcgg 39540 
gccgacatct atctgagcga ggacacgccc ggcaaaccgg caggcgctcc gatcgtggcc 39600 
gaggaagatc tgctcacgct gatggatgcc tggatcgaaa agggccagta cggtcgtttg 39660 
ctggagtact ggaccaaggg ccaaccgatc gactggaaca aactctattg gcgcaagctg 39720 
tatgcggacg gacggccgcg gcggatcagc ctgcccacct atccgttcga gcaccggcgt 39780

tattggcaaa cgccggtgcc gggcgagcga agcctgcacg ccaccgcgcc agctactcgg 39840 

gaaacggttg cggttggtgc catgccggat ccggccggcg ctacggtgca agcccggttg 39900 

tgcgccttgt gccaagtgtt gttgggcaaa ccggtcacgg cccagatgga tttctttgcc 39960 

gtcggcggcc attcggtgct ggcgatccaa ttggtctcgc gcatccgcaa aagcttcggg 40020 

gtggagtatc cggtcagcgc tttgttcgaa tcggcgctgt cgtcggacat ggcgcggcag 40080 

atcgaacaat tgcgggtgaa cggagtcgcc aagcgcatgc cggcgttgtt gcctgccggg 40140 

cgcgtgggcg cgattcctgc gacttatgca caggagcgcc tatggctcgt ccacgaacat 40200 

atgagtgagc aacgcagtag ttacaacatc acctttgcca tgcacttcag aggcgtcgac 40260 

ttccgtgctg aagcgatgcg tgccgcattg aacgcgctgg tggtgcggca cgaagtgctg 40320 



cgcacacgct ttctttcgga ggacgggcag ctgcaacagg tgatcgctgc ctcgttgacg 40380 

ttggaggtgc cggtaagaga gatgtcggtc gaggaggtcg acctgctgct ggccgcgagc 40440

acgcgggaga ctttcgatct gcggcagggg cccttgttca aggcacgcat cctgcgcgtg 40500 

gcggccgatc accatgtggt gttgagcagc atccaccaca tcatttccga cggctggtcg 40560 

ctgggagtgt tcaaccgtga cctgcaccag ctgtacgagg cgtgtttgcg cggcacgccc 40620 

cccacactgc cgacgctggc ggtgcagtat gccgactacg cgctgtggca acggcaatgg 40680 

gagctggcgg ctccgctgtc gtactggacg cgggcactgg aaggctacga cgacggcctg 40740 

gacttgccct acgaccggcc gcgcggcgcc acgcgggcgt ggcgggcagg gctggtcaaa 40800 

caccgctatc cgccgcaact ggcccagcag ttggcggcct acagccaaca gtaccaagcg 40860 

acgctgttca tgagcctgct ggcaggcctg gcgttggtgc tgggccgtta cgccgatcgc 40920 

aaggacgtgt gcatcggcgc gacggtctcc ggccgcgacc agctggagct ggaagagctg 40980 
atcggctttt tcatcaatat tttgccgctg cgggtggacc tgtcggggga tccgtgcctg 41040
gaggaggtgc tgctgcgcac gcgtcaagtg gtactggatg gcttcgcgca ccagtcggtg 41100 
ccgttcgagc acgtgttgca ggcgctgcgg cgtcagcgcg acagtagcca gatcccgctg 41160 
gtgccggtga tgctgcgaca ccagaacttc ccgacgcagg agattggcga ttggcccgag 41220 
ggagtgcggc tgacgcagat ggagctgggg ctggaccgta gcacgccgag cgagctggat 41280 
tggcagttct acggcgacgg cagctcgctg gagctgacgc tggaatacgc gcaggacctc 41340 
ttcgacgaag cgacggtgcg gcggatgatc gcacaccacc agcaggcgtt ggaggcgatg 41400 
gtgagccggc cacagctgcg ggtgggcaag tgggacatgc tgacggccga agagcgccgg 41460 
ctgtttgccg cgctaaatgc gacaggtacg ccacgggagt ggcccagtct ggcgcagcag 41520 
ttcgaacggc aggcgcaggc gacgccgcag gccatcgcgt gcgtgagcga tgggcagtcg 41580 
tggagctatg cgcagttgga ggcgcgcgcc aaccagctgg cacaggcgct gcgggggcag 41640 
ggcgcgggcc gggacgtgcg ggtggcggta cagagtgcgc gcacgccgga actgctgatg 41700 
gccttgctgg cgatctttaa ggccggtgcg tgctatgtgc cgatcgatcc ggcctacccg 41760 
gcggccbacc gcgagcagat cctggccgag gtgcaggtgt cgatcgtgct ggagcaagac 41820 
gagctggcgc tggacgagca agggcagttc cacaatccgc gttggcgcga gcaagccccg 41880 
acgccgctgg ggctgaggga gcatccgggc gacctggcgt gcgtgatggt gacctccggc 41940 
tcgaccggcc ggcccaaggg cgtgatggtg ccgtatgcgc agctgtacaa ctggctgcat 42000 
gcaggctggc agcgttctcc gctcgaggcc ggggagcggg tgctgcagaa gacctcgatc 42060
gcctttgcgg tgtcggtaaa ggagttgcta agcgggctgc tggcgggggt ggaacaggtg 42120 
atgctgccgg acgagcaggt gaaggacagc ctggcgttgg cgcgggcgat tgagcaatgg 42180 
caggtgacgc ggctgtacct agtgccatcg cacctgcagg cgctgctgga cgcgacgcaa 42240 
ggacgagacg ggctactgca ctcgctgcgt cacgtggtga cggcggggga agcgttgccg 42300

tctgcggtgc gcgaaacggt gcgggcgcgt ctgccacagg tgcagctatg gaacaactac 42360 

ggctgcacgg aactgaacga cgcgacctac caccggtcgg atacggtggc gccaggaacg 42420 

tttgtgccga tcggcgcacc gatcgccaac accgaggtat acgtgctgga ccggcagctg 42480

cggcaggtgc cgatcggggt gatgggcgag ctgcacgtac acagcgtggg gatggcgcgc 42540 

ggctactgga accggccggg gctgacggcc tcgcgcttca tcgcgcaccc gtatagcgag 42600 

gagccgggca cacggctgta caagaccggt gacatggtac gccggctggc ggacgggacg 42660 

ctggaatacc tgggccgaca ggacttcgag gtcaaggtgc gcggccaccg ggtggatacg 42720 

cggcaggtgg aggcggcctt gcgggcgcag cccgcggtgg ccgaggcggt ggtgagcggt 42780 

caccgggtgg acggggacat gcagttggtg gcctatgtgg tggcgcgtga agggcaggca 42840 

ccgagcgcgg gcgagttgaa acaacagctg tcggcgcagt tgccgaccta catgctgccg 42900 

accgtgtacc agtggctgga gcagttgccg cggctgtcca acggcaagtt ggaccggttg 42960 

gccctgccgg caccgcaggc ggtacacgcg caggagtacg tcgcgccacg caaccaggcc 43020 

gagcaacggc tggcggcact gtttgccgag gtgctgcggg tggagcaggt aggcatccac 43080 

gacaacttct tcgccttggg tgggcactcg ctgtctgcat cgcaactgat ctcgcgtatt 43140 

gccagggata tggcgatcga tctgcccctg gccatgctgt tcgagctgcc cacggtagcg 43200 
cagcttagcg aatccctcgc cagccatgca cgcgacagcg attacgatgt catccccgca 43260 
agcaccgagg aggcgaccat tccgctttcc actgcgcagg agcgcatgtg gttcctgcac 43320 
aagttcgtgc aggagacgcc gtacaacacc ccgggtctcg ccttattgca aggcgaactg 43380 
gacatttcgg ccttgcaggt agcatttcgc tgtgtgccag aacggcacgc cgtgctgcgt 43440 
acccatttcg tggaaaccga gcagcaatgc gtacaggtca ttggcgcagc agagcagttc 43500 
gtgctgcagc ttaggtcgat tcgcgacgag gctgatctgc atggcctatt gcacacagcc 43560
gtcagcgaac ccttcgattt agaacgcgag ctgccattgc gcgccctgct gtatcgcctg 43620 
gacgaccggc ggcattacct agcagtggtc atccatcaca tcgtcttcga cggctggtcg 43680 
acctcaatcc tgtttcgtga gctggccacg cactatgctg catgccgcca tggccaatcc 43740 
gcgcctttgc caccgctgga gcttagctat gccgattacg cacgctggga gcgtgcgagg 43800 
ctgaaccagg aagacgcgct gcgcaagctc gaatattgga aaacgcagct tgccgatgca 43860 
ccgccgctgg tgttgcccac gacctatgcg cggccggttt tccagaactt caatggcgcg 43920 
actgtggcgc ttcagatcga gccgccgctg ctgcaacgcc tgcagcgttt cgccgacgca 43980 
cacagcttta cattgtacat gctacttctg gcagcactgg gcgtcgtatt gtcgcgccat 44040 
gcccggcaga agcatttctg cattggcagt ccggtcgcca atcgcgcccg agccgagttg 44100 
cacggtbtga tcggtttgtt cgtcaacacc ctggcggtac ggctcgattt ggacggcaat 44160 
cccagcgtgc gcgagctgct cgaacgcatc cactgcacca cgctggccgc ctacgagcac 44220 
caggatgtgc cgttcgaaag aatcgtggaa agcctgaagg taccgcgcga taccgcgcgt 44280 
aacccgctgg ggcaggtgat gctcaatttc cagaacatgc caatgtcggc gttcgacctg 44340 
gatggtgtcc aggtgcaggt gctccccatg cacaacggca cggccaagtg cgagctgacc 44400 
ttcgacctgc tgctggatgg ctcacgccta tccggtttcg tcgaatacgc cactgggctg 44460 
tttgcgccgg aatgggtcca ggcgctggta cagcaattca agtgtgtgct ggcggcattg 44520 
gtggaacggc cggaggcatc gctgaatgat ttgcccatgg cgcccaacga ggcgcaaccg 44580 
gcgtcgccgg cattgatgaa gcatgtcgcg ccgagcttgc ccaacttact tgaggctatg 44640 
gcggccaatg atgccgcacg cctcgccttg caagcgccgg aaggtgcgct cagttacgct 44700 
cagccaatcg aggcagcaaa cgagttcgcc tggcgtttgc ggtgcgagca cgccggtccg 44760 
gacaaagtcg ttgccctgtg cctagcgcct tgctccgcct tggtggttgc tttgctggcc 44820

gcttcattat gcggtgcggc gagcgtgctg atcgatccga cgacgactgc cgaggcgcaa 44880 

tacgaccagt tgttcgaaac gcgggccggc atcgtggtga cctgttctag cttgctggag 44940 

aagtugccgc tcgacgacca ggctgtagtg ctgatcgacg agcaagctgc agaagcgacg 45000 

ccgcgtttga tgcatttcac cgacgatcca gctttgcccg caacgctgta ttgtgtgtgt 45060

gacgaaaagg ggcgaacccg cacgatcatg gtcgaaagcg gcagtttgtc gagtcgcctg 45120 

ctcgatagcg tgcagcgttt cagtctcgaa cgcaccgatc gcttcctgct gcgcagcccg 45180 

ctttctgccg aactggcgaa taccgaagta ctgcaatggt tggcggcagg cggcagcctc 45240 

agcatcgcac ccatgcatgg cgatttcgat gccgctgcct ggctggagac cctcgcgacg 45300 

tacgcgatca ccgtggccta cctggctcaa gttgaattga ccgagatgct ggcgcatctg 45360 

caaaaccatc ctcttgagcg caacaagctg gccggcttac gcgtgctggt ggtgcatggc 45420 

gcgcccttgc cgatcgcgcc actgatgcgc ctagacgcgt ggttgcgaga ggtgggcggt 45480 

tccgcacgga tcttcgccgc ctacgggaat gccgagttcg gtgccgaaat attgagccag 45540 

gatgtcagcg ctgcattgca agcgggtatt ggcgctcaat acaagcatcg ccgtggtctg 45600 

ttcccgttgg gtgccaactc gatgtgtcac gtggtgcaga gcaacggccg catcgcgccc 45660

gacggcatgg ttggtgaatt gtggatcaca cagccagcct gcttgtacaa aaccgatgca 45720 

ttggtgcgtc gcctggcaaa tgggcaactg gaatggttgg gctccctcga tgtccagtcg 45780 

cgtatcgatg atccccgcat cgatctgtgc gtcgtggagg cacaactgcg cttgtgcgaa 45840 

gacgtcggcg aagcggtagt gctgtatgag ccgttgaagc gctgcttggt agcctatctc 45900

tcggcccgta gcacagctgc aatcatgacc gacgagacgc tggccaggat ccgccaggcc 45960 

ctgagcgaaa ccttgccgga ttatctactg cctgcaatct gggtgccgct cgcgcactgg 46020 
ccacgcttac cccatgggcg ggtcgacctc ggcgccttgc ctgcaccgga tttcgatctt 46080

gcgcggcatg agtcgtacat agcgccacgc acagccgtcg aacaggccgt ggccgaaata 46140 

tggcaacgcg tgttgaagcg tacccaggtc ggcgtgcatg acaatttctt cgagctgggc 46200 

ggccattcgg tgctggcgat ccagctggtg tccggcttgc gcaaggcttt ggccatcgaa 46260 

gtgccggtca ccctggtgtt cgaggcgccg atactggggg cgctggcgcg gcagatcgcc 46320 

cccttgttgg tcagcgaacg gcgtccgcgc ccgcctggcc tgacgcgcct ggagcataca 46380 

gggccgattc cggcttcgta tgcacaggag cggttatggc tggtgcacga gcatatggag 46440 

gagcagcgaa ccagctacaa catcagtaac gcagcgcatt tcatcggagc agccttcagc 46500 

gtcgaagcga tgcgtgccgc attgaacgcg ctggtggcgc ggcacgaagt gctgcgcaca 46560 

cgctttcttt cggaggacgg gcagctgcaa caggtgatcg ctgcctcgtt gacgctggag 46620 

gtgccggtac gcgaggtgtc ggccgaggag gtcgacctgc tgctggccgc gagcacgcgg 46680 

gagactttcg atctgcggca ggggcccttg ttcaaggcac gcatcctgcg cgtggcggcc 46740 

gatcaccatg tggtgttgag cagcatccac cacatcattt ccgacggctg gtcgctggga 46800 

gtgttcaacc gtgacctgca ccagctgtac gaggcgtgtt tgcgcggcac gccccccaca 46860 

ctgccgacgc tggcggtgca gtatgccgac tacgcgctgt ggcaacggca atgggagctg 46920 

gcggctccgc tgtcgtactg gacgcgggca ctggaaggct acgacgacgg cctggacttg 46980 

ccctacgacc ggccgcgcgg cgccacgcgg gcgtggcggg cagggctggt caaacaccgc 47040 

tatccgccgc aactggccca gcagttggcg gcctacagcc aacagtacca agcgacgctg 47100 

ttcatgagcc tgctggcagg cctggcgttg gtgctgggcc gttacgccga tcgcaaggac 47160 

gtgtgcatcg gcgcgacggt cbccggccgc gaccagctgg agctggaaga gctgatcggc 47220 

tttttcatca atattttgcc gctgcgggtg gacctgtcgg gggatccgtg cctggaggag 47280 
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gtgctgctgc gcacgcgtca agtiggtactg gatggcttcg cgcaccagtc ggtgccgttc 47340 

gagcacgtgt tgcaggcgct gcggcgtcag cgcgacagta gccagatccc gctggtgccg 47400 

gtgatgctgc gacaccagaa cttcccgacg caggagattg gcgattggcc cgagggagtg 47460 

cggctgacgc agatggagct ggggctggac cgtagcacgc cgagcgagct ggattggcag 47520 

ttctacggcg acggcagctc gctggagctg acgctggaat acgcgcagga cctcttcgac 47580

gaagcgacgg tgcggcggat gatcgcacac caccagcagg cgttggaggc gatggtgagc 47640 

cggccacagc tgcgggtggg caagtgggac atgctgacgg ccgaagagcg ccggctgttt 47700 

gccgcgctaa atgcgacagg tacgccacgg gagtggccca gtctggcgca gcagttcgaa 47760 

cggcaggcgc aggcgacgcc gcaggccata gcatgcgtga gcgatgggca gtcgtggagc 47820 

tatgcgcagt tggaggcgcg cgccaaccag ctggcacagg cgctgcgtgg gcagggcgcg 47880 

ggccgggacg tgcgggtggc ggtacagagt gcgcgcacgc cggaactgct gatggccttg 47940 

ctggcgatct tcaaggccgg tgcatgctat gtgccgatcg atccggccta cccggcggcc 48000

taccgcgagc aaatcctggc cgaggtgcag gtgtcgatcg tgctggagca aggcgagctg 48060 

gcgctggacg agcaagggca gttccgcaat cggcgttggc gcgagcaagc cccgacgccg 48120 

ctggggctga ggggacatcc gggcgacctg gcgtgcgtga tggtgacctc cggctcgacc 48180 

ggccggccca agggcgtgat ggtgccgtat gcgcagctgc acaactggct gcatgcaggc 48240 

tggcagcgtt ctgcgttcga ggccggggag cgggtgctgc agaagacctc gatcgccttt 48300 

gcggtgtcgg taaaggagtt gctaagcggg ctgctggcgg gggtggggca ggtgatgctg 48360 

ccggacgagc aggtgaagga cagcctggcg ttggcgcggg cgatcgagca atggcaggtg 48420 

acgcggctgt acctagtigcc gtcgcacctg caggcgctgc tggacgcgac gcaaggacgc 48480 

gacgggctac tgcactcgct gcgtcacgtg gtgacggcgg gggaagcgtt gccgtcggcg 48540 
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gtgggcgaag cggtgcgggt gcgcctgcca caggtgcagc tatggaacaa ctatggctgc 48600 

acggaactga acgacgcgac ctaccatcgg tcggatacgg tggcgccagg aacgtttgtg 48660 

ccgatcggcg caccgatcgc caacaccgag gtatacgtgc tggaccggca gctgcggcag 48720

gtgccgatcg gggtgatggg cgagctgcac gtacacagcg tggggatggc gcgcggctac 48780 

tggaaccggc cggggctgac ggcctcgcgc tbcatcgcgc acccgtatag cgaggagccg 48840 

ggcacacggc tgtacaagac cggtgatatg gtacgccggc tggcggacgg gacgctggaa 48900 

tacctgggcc gacaggactt cgaggtcaag gtgcgcggcc accgggtgga tacgcggcag 48960 

gtggaggcgg ccttgcgggc gcagcccgcg gtggccgagg cggtggtgag cggtcaccgg 49020

gtggacgggg acatgcagtt ggtggcctat gtggtggcgc gtgaagggca ggcaccgagc 49080 

gcgggcgagt tgaaacaaca gctgtcggcg cagttgccga cctacatgct gccgaccgtg 49140 

taccagtggc tggagcagtt gccgcggctg tccaacggca agttggaccg gttggcgctg 49200 

ccggcgccgc aggtggtaca cgcgcaggag tacgtcgcgc cacgcaacga ggccgagcaa 49260 

cggctggcgg cactgtttgc cgaggtgctg cgggtggagc aggtgggcat ccacgacaac 49320

ttcttcgcct tgggtgggca ctcgctgtct gcatcgcaac tgatctcgcg catccgccaa 49380 

agttttcacg tcgatctgcc gctgagccgg atcttcgagg cacccacgat cgagggcctg 49440 

gtcaggcagc tagcgttgcc tagtgaaggc ggcgtggcca gcatcgccag ggtagcgcga 49500 

aaccggacga tcccattgtc gctgttccag gaacgcctgt ggttcgtgca ccaacacatg 49560 

cctgagcaac gcaccagtta caacggcacg ctcgccttgc gtttgcgtgg tcctttgtcg 49620

gtggaagcga tgcgtgcagc gctgcgtgcg ttagtgctgc gccacgaaat cttgcgtacc 49680 

cgcttcgtgt tgccgaccgg tgctagcgag ccggtgcagg tcattgacga gcacagcgat 49740 

ttccagctct cagtacagct agtcgaggat actgagatcg cgtcgctgat ggatgaactg 49800 
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gcaagtcata tctacgactt agccaacggc ccgctgttca ttgcatgcct tttgcaactg 49860 
gatgagcaag aacatgtgct gctaatcggc atgcatcacc ttatctacga cgcttggtcg 49920 
caattcaccg tgatgaaccg cgatctacgc gtgctgtatc accgccacct cggacttgcc 49980 
ggcggagatc fcgccggaatt accgatccaa tatgccgact atgcgatctg gcaacgcgcc 50040 
cagaacctgg acgcgcaact ggcctattgg caggctatgt tgcacgacta cgacgacggc 50100 
ctggagctgc cctacgacta tccgcgtccg cgcaatcgca cctggcacgc agcggtctac 50160 
acacacacct atccggctga actggtacag cgctttgccg gcttcgtaca ggcgcatcag 50220 
tcgaccttgt tcatcgggct gttggccagc ttcgcggtcg tgttgaacaa atacaccggc 50280 
cgggacgact tgtgcatcgg taccaccacg gcagggcgca cgcacctgga gctggagaac 50340 
ctgatcggtt tcttcatcaa catcttgcct ttgcgcttgc gcttggacgg cgatccggac 50400 
gttgccgaaa tcatgcggcg aacacggttg gtggcgatga gcgcgtttga gaaccaggcg 50460 
ctaccgttcg agcacctgct caacgccctg cacaagcaac gtgacaccag ccggattccg 50520 
ctagttccgg tggtgatgcg tcatcagaac ttcccggaca cgatcggcga ctggagcgat 50580 
ggcatccgta ccgaagtgat ccagcgcgat ctgcgtgcca cccccaatga aatggacctg 50640 
caactcttcg gcgacggtac ggggctttcg gtcacagtgg aatacgcggc ggagctgttc 50700 
tcagaagcga ccattcgccg cctgatccac catcaccaac tcgtcctgga gcagatgttg 50760
gcggcccatg aaagcgccac gtgccccttg gatgttgccg actagcaaaa gccggccgcc 50820 
gtcacccgtt catcgatagc gagggcaatc atggattcag cgttacctac atctgcattt 50880 
accttcgatc tcttttacac cacggttaac gcctactatc gcactgccgc agtcaaggcg 50940 
gcgatcgaac tggggctatt cgatgtggtg gggcagcagg gccgaactcc cgcagccatc 51000 
gccgaggcct gccaggcgtc gccgcgcggc attcgcatcc tttgctatta cctagtatcg 51060 
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atcggttttc tacgccgcaa cggtggcctg ttctacatag atcgcaacat ggccatgtac 51120 

ctggatcgta gttcgcccgg ctacctgggt ggcagcatca agttcctgct ctcgccctac 51180 

atcatgagcg ccttcaccga tctgaccgcc gtagtcagga ccggcaagat caacctggcg 51240 

caggacggcg tggtggcacc ggatcacccg cagtgggtgg aatttgcacg cgcgatggca 51300 

ccgatgatgg cgctgccctc ggcgttgatc gccaatatgg tgtcgttgcc cgctgatcgg 51360 

ccgattcgtg tgctggacgt ggcagccggc cacggcctgt tcggcatcgc cttcgcgcag 51420

cgcttccgcc aggctgaagt gagcttcctg gactgggaca acgtgctaga cgtagcacgc 51480 

gaaaacgccc aggcggccaa agtggccgag cgagcgcgtt tcctgcccgg caacgcattc 51540

gacctcgatt acggcagcgg ctacgacgtg atcttgttga ccaacttcct gcaccatttc 51600 

gatgaggtcg atggcgagcg catcttggct aagacgcgcg atgcgctgaa cgacgacggc 51660 

atggtgatca ctttcgaatt catcgccgac gaagagcgtt cctcaccgcc gctggccgcc 51720 

accttcagca tgatgatgct gggcaccacc ccggcgggcg agtcctacac ctatagcgat 51780 

ctggaaagga tgtttcggca tgccggcttc ggccacgtgg aactaaaatc gataccgccg 51840 

gccttgctga aagtggtggt ttcccgcaag agggccccat aatgatcgaa tcggcgacat 51900 

cccctgtggc gaaaaccgag cgcatctggt gcaccgagct ggacctggat gcactcaacg 51960 

ccatgtcggc caacacgatg caggccctgc tcggtatacg catgatcgag atcggctcgg 52020 

actatctggt ctcctgcatg tcggtggact ggcgttgcca ccagccctat ggggtattgc 52080 

atggcggcgc atcggtcacc ctggccgagg ctaccggcag catggcggcc tccatgtgcg 52140

tgccggccgg ccaacgttgc gttggcctag acatcaatgc caaccacatc gcgagcatct 52200 

ccagtggcca agtacagtgc atcgcgcggc cgctgcacat aggggccttg acccaggtat 52260 

ggcagatgcg catctatgac gaaggtgacc gcacgatctg cgtgtcgcgc ctgaccatgg 52320 
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cggtattatc ggtgcacgtc gcgcgcgtat ccccgaatcc agccagcagc ggagtccaga 52380 



gccagcctgc gttcaccaat ggtttgaagc gcaggtgagt tcgacaccgg atgcgcctgc 52500 

tgccttctta ggcgagcgtc gaatgagtta tggccagctc aacacccgcg ccaatcggct 52560 

tgcacggctg ttgcagtcac agggcgttgg gcctggtgcc cgggtcgcgg tgtggatgaa 52620 

tcgcagcccc gaatgcctgg ccgctttgct ggcggtcatg aaggccgggg cagcttatgt 52680 

accgatcgac ctgagcctgc cgatccgacg tgtccaatac atcttgcagg acagccaggc 52740 

ccggctcgta ctggtcgatg acgaagggca aggccgcctg gacgaacttg agctgggcgc 52800 

gatgactgcc gtcgatgtct gcggcactct ggacggcgac gaggcgaatc tggacctgcc 52860 

ttgcgatccg gcgcagccgg tttattgcat ctatacctcc ggctccacag gtagccccaa 52920 

gggcgtgctg gtacggcaca gcgggttggc taactacgtg gcctgggcta agcggcaata 52980 

cgttacggct gacacgacga gtttcgcctt tbactccbcg ctgtcgttcg atctgaccgt 53040 

cacctcgatc tacgtgcccc tggtggctgg cctgtgtgtg catgtgtacc cggagcaggg 53100 

cgacgacgtg ccggtaatca accgcgtgct ggacgacaac caagtagacg tgatcaagct 53160

gacaccctcg cacatgctga tgctgcgcaa cgcggcactg gcgacgtctc ggctgaagac 53220 

gctgatcgtg ggtggcgagg acctgaaagc ggcggbggcg tacgacatcc atcagcggtt 53280 

ccgccgcgat gtggcgatct acaacgaata cggtcctacc gaaaccgtag tggggtgcgc 53340 

gatccatcgt tacgatccgg cgaccgaacg cgaaggctcg gtgccgattg gtgtgccgat 53400 

cgatcacacc agcctccacc tgctcgatga acgtctgcag ccggtcgcac cgggcgaggt 53460 

cggccagabc cacatcggtg gcgcgggcgt ggccatcggc tatgtgaaca agccggagat 53520 

caccgatgcg caattcattg acaatccctt cgaaggcagc ggccggcttt acgccagtgg 53580 



cgtgaacgaa actgcaactg taaccaaggc taccctcagt ccagcgaagg cgagtataac 
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cgacctagga cgcatgcgtg ccgacggtaa gcttgaattc cttggccgca aggattcgca 53640 

gatcaagctg cgcggctacc gcatcgaact gggcgagata gagaacgttc tgcttggcca 53700 

cgcagccttg cgcgaatgca tcgtggatac caccgtggcg ccgcgccgcg actatgacag 53760 

caagagcttg cgctattgcg cgcgttgcgg tatcgcttca aatttcccca ataccagctt 53820 

cgacgagcac ggtgtctgca accattgcca cgcctacgac aaataccgga acgtggtcga 53880 

ggattatttc cggaccgaag atgagctacg tactatcttc gagcaggtca aggcgcacaa 53940 

caggctccgc tacgactgcc tggtggcttt cagcggcggc aaggacagca cctatgcgct 54000 

atgccgcgta gtggacatgg gcctgcgcgt gttggcgtac accctggaca atggctacat 54060 

ctccgacgag gccaaggcaa acgtcgaccg cgtcgtgcgc gagctggggg tggaccatcg 54120 

ctatctgggt actccacaca tgaacgccat cttcgtggac agcctgcatc gccacagcaa 54180 

cgtctgcaac ggctgcttca agaccatcta tacgctgggt atcaacctgg cgcacgaagt 54240 

gggcgtaagc gacattgtaa tgggcctgtc caaaggacag ctgttcgaga cgcgcctgtc 54300 

tgagctgttt cgcgccagca ccttcgacaa ccaggtattt gagaagaacc tgatggaggc 54360 

gcgcaagatc taccatcgca tcgacgacgc ggcggcccgc ctgctggaca cctcttgcgt 54420 

gcgcaacgat cgcttgctcg aaagtacgcg tttcatcgac ttctaccgct actgcagtgt 54480

cagccgcaag gacatgtatc gctatatcgc cgagcgcgta ggctggagcc gtccggctga 54540 

caccggccgc tcgactaact gcctgctcaa cgatgtgggc atctacatgc acaagaagca 54600 

acgtggctat cacaactatt cgttgcccta cagttgggac gtgcgggtag gccatatccc 54660

aagggaagac gcgatgcgcg agctggagga caccgacgat atagacgagg ccaaggtact 54720 

gggcctgctc aagcagaticg gctatgactc aagcctgatc gatacccagg cgggcgatgc 54780 

gcagctgatc gcctactacg tggcggcgga ggaactggat ccggtggcat tgcgcaattt 54840 
tgctgctgcg atcttgcccg agtacatgct gccttcgtat ttcgtgcggc tggaccgaat 54900
gccgttgacg ccgaatggca aggtgaaccg ccgagcattg ccgaggccgg agttgaagaa 54960
gaacgccagc gaggcgcata ccgagccgag cagtgcgcta gagcaggaac tggtgcaaat 55020
ctggaaagag gtgctgatgg tcgacaaggt cggcgtcagg gacaactttt tcgagctggg 55080
cggccactcg ctgagcgcgc tgatgttgct ctacagcata gccgagcgct accagaagat 55140
ggtcagcatc caggcattct cggttaatcc gaccatcgaa ggtctgtcgg agcatctggt 55200
cgcataaaag ggcaaggacc tcagcgctgt cctctgcatg gtgcggcgtt ggggttcgca 55260
cgccattcaa tatgattgtc atcacttatc gtgagtacgc taatcatagg gcagggcgcc 55320
gacacaggag gcattcacgg cgctgccgac gcacttgtgc cgccggcgaa cgcccgactg 55380
tggagcgcat ccgtgcccac ctaggcactg ggtctccgaa caccgtggtg cgctggctgg 55440
atacctggtg gcaaggcttg ggagatcgaa tcgccaacgt gcctgaagca gtctccgcac 55500
tggtagggca gtggtggacg ctggctttgg atcatgcgcg aagccatgcc ggtgaggcca 55560
tcgctgctga gcgcacagcc ctccaagacg cacgcagcag cttggagggc gaccgccatg 55620
gctgcaggcc gagctcgctc agctacgggg tgagaccgaa gctgcccact aaacggaaca 55680
gctcgcgacg acccgagcta ttgagctgga gcgcctggtc gagcaactcc aacgccagat 55740
taatga^atg gagcggcagc gtgacacggc ggtgcagcgc atcaccgagg ctgaggaggc 55800
acgggaggtg ctccggaggc agtacgtgaa aatttgatc 55839

<210> 2

<211> 2986

<212> DNA

<213> Xanthomonas albilineans

<400> 2
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gaattcagcg 
gcggccgtaa 
ctgcagaaga 
ccgcaacgcg 
aaccgacacc 
ctcggatcaa 
gaccgccgtg 
gatcttgccg 
tttgacgccc 
cctgctcctt 
tcgccatcgt 
tccttgctct 
tgacgcagca 
tccactt:cga 
aactgtcggc 
gacttgtagc 
acgatttcgt 
agccccccgc 
aagtatggca 
cgcgtcgtcc 
gcggcgggta 



cccggcttga 
atcaattgca 
tgctgtccga 
cggcggttgc 
acgtcgtggt 
gccaactgac 
ttctcggtga 
tcggcggtga 



cgtcgacctt 
tggcgttact 
ataggacggg 
ttcggtcgcc 
tgtccgtgta 
gctggtactt 
ggtactggcg 
atttgcagaa 



gtcgccgagg 
ccgttattcg 
attggatcgc 
gcacacggga 
ogggatcagc 
ggacttggtg 
cagggtgttg 
cttgcgacga 



ccctgcggca ccaggtagtb 
ccgcccaggt tggtgacttt 
ttagcggcgg catgcggcca 
gcggagCcgg acatgtaccc 
atgcgcgcgc ttccgaacga 
gccaggaaac gcgcacgctt 
ccggtcacgc ggctcggcac 
agatccttgt agtcgatctc 
cggaagaact tggacatgga 



aggcggcttc gacggcgtcg ccgtcggctt cgttggcggc ggcggacaca 



cgtcgtcgcg acgacgacgc tcaccacggt 
tcatgatcag cgactgctcg gtgtcggcgc 
cggcgtcgtt gaagcggaag ctctcgacca 
tgttgagcat gacgkagtgc gccttcacca 
ggccccagtc ttccaggcgg tggatggtgc 
gctcgatcat ggcggggacc tgctcgctct 
aatgacgact catgtggttg tacctttcgg 
aggtggcggt ggagcaaggg ttcccgcccg 
gcgcccttga ccaatgacaa gctcatgcac 
atcgccattg cgcccctccc cgaccccaag 
ggcgcgactc tgcgacacta gcgcaatgbt 



cgggcttgtc gcccttctcg 
catcgcgctt gatcgccagg 
actcgctcag cacggcctga 
gattctggat cgggtaggcc 
cgccgccgtt ctcgaccagc 
ggtccggatg gaccaggaac 
atgtggccca agggccagtc 
aataggcgca ggaagccaat 
ccaggacgcc cgctctgctc 
catcgaccaa aggaccgaat: 
atcgtcgaca ttgacgccca 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
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cagccctcag cgcaacgcaa tgcccaatgc cgtaccgatg cagggcgcgc ggggactccc 1320 
gcagccgcaa gcgatgaacc cagggttgcc gagcgtcggc ggcttgagcg caggccagcc 1380

attgcagttg tcgttagcac cggaactgca ggcagccgcg cgcagtgccc accgccatct 1440

gctcgacgac ggcacggcgc tttacctgct ggcgttcgat accgcgcaat tcgacccggg 1500 

ggctttcgcg gcaatggcaa tcgcccgccc ggacagcatc gcccgcagcg tgcgcaagcg 1560 

tcaggccgag ttcctgttcg gccgtctggc cgcgcgactg gcgctgcaag aggtgctggg 1620 

acctgcgcaa gcgcaggcag acattgcaat cggcgcgacg cgcgcgccct gctggcctgc 1680 

cggcagcctg ggcagcattt cccattgcga ggactacgcg gccgccatcg ccatiggcggc 1740

cggcacccgc cacggcgtgg gcatcgatct ggaacgacca atcacacccg cgJjcgcgcgc 1800 

ggcgttgctg agcatcgcaa tcgatgccga cgaagccgct cgtctggcaa aggcggcaga 1860 

cgcgcagtgg ccgcaagacc tgctgctgac cgcactattt tcggccaagg aaagcctgtt 1920 

caaagccgcc tacagcgcgg tcggacgcta cttcgacttc agcgcggcac gcctgtgcgg 1980 

catcgacctg gcacggcaat gcctgcatct gcgcctgacc gagacactct gcgcgcaatt 2040

cgtggccggg caagtgtgcg aggtcggctt cgcgcgccta ccaccggacc tggtgctcac 2100 

ccactacgcc tggtgagcac gcggacagtc gaacccgcca acgccaacgg cactcaagac 2160 

gtggcgtgcg ccgcgtcggt cgtgaagctc tccccgcagc cgcactcggc ggtggcattg 2220 

ggattgcgga acacgaaggt ctcacccaag ccctgcttgg cgaagtcgat ttcggtgcca 2280 

tcgaccaact gcagactggc ggcatcgaca taaatccgca ctccgtcctg ctcgaacacc 2340

gcatcgtccg cgcgtgcctc gtgcgccaga tcggtgacat ggccccaacc ggaacagcct 2400 

gtgcgtacca ccccgaaacg tagacccagc gcaccgggag tctggtcgag gaaacgctgc 2460 

acgcgtgcaa acgcggcggg ggtgaggcgg atggccatga cgaacgactc caacgacttg 2520 



cgatacgaca ttatacgacc gatgcccgca acgcctcgca agcgctacgc tccagccagt 2580
acacttgttc attccatatc gagccactgc ggcgaggact caagtcatga cggtggtgag 2640
cgttgaacat gcgctggcag ggaagatccc ggtcggcggc gaagtgaccg tccgcggctg 2700
ggtccgtacc cggcgcgact ccaaagcggg gctgtccttc gtcaatgtca gcgacggctc 2760
gtgcttcgcg ccgatccagg tggtggctcc ggccgcgctg cccaactacg aaccggaagt 2820
gaagcgcctg accgccggct gcgcggtgat cgcgcgcggg cacctggtcg cctcgcaagg 2880
ccagggccaa agcttcgaga tccaggccga gagcatcgag gtactgggct gggtcgagga 2940
cccggagacc tacccgatcc aacccaaagc gcattcgctc gaattc 2986

<210> 3
<211> 9673
<212> DNA

<213> Xanthomonas albilineans
<400> 3

gaattcggac ctggcgagta cttggaccgc gctgtgatgg tcaactgcca gggbggaagc 60
ttcgctcccg gcatcgagat gacgttcgtc gtgcgcgatc cggcgctcta ccgcgccgac 120
tggcaaagca gcggctgcgg accattccga atccgaggac gtgcgttgga ctacgccagc 180
gtgcaatacg gccagccatt tctgagcgtt ggctatctcc cctaccaacc cggtccggat 240
ggcatcgacc ccgcgccgct ggagccaggc gacctctcca agtittatgtc gattccttgg 300
caaaccgact acaacgcctg cgcgacgttc actgccgacc cgaatccgga caacagcacc 360
acactttact gggcctggcc agcacagcgc ccattcaccg tgcatgtagc caccgatgtc 420
agggacggca agccaggccc gcaacgctat tccattcgcg gtgcgggcac cacatccgac 480
gatctgagta acgctggccg tttccaaaac catatcgaca tggtgaacaa ctggcaccgc 540
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atcggcttcg 
tacctggaag 
aacagcgcgg 
ccggcgggat 
atcgatgccg 
ctgctgccgc 
- atcggcaacg 
ccacacgggc 



tcatccaggg cagcgcgatc 
tggccagtca actggacgaa' 
acacttgaag catgaatacc 
gcgcggcagc catcgcattg 
gcgacggtca gcgcccgcgt 
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gacggcgata ttcgctacag 
ccggagatcg cgccctggcc 
cactgcgacg tggcggtcat 
cgccgtgccg gcgtcggcag 
tacggcgaga gcctgcctcc 






atgcactggg cgtggccgat accttcgctg cgctggacat 
cctcgtcatg gggagcgcaa 
ccggttggca ggtggatcgc 



acgttgggct acaacgactt 
cgcgatttcg atgcgttcct 



cccggacatg 
gatgaatgcc 
cggcgcaggg 
cgtggtcctg 
ggcgaccggc 
gcgcagatgc 
cctgttcgat 
gctgcaacag 



gacccggatg gtctgcgact gcgcttgcgc 
gcccgtttcg ccgtcgatgc cagcggccaa cgcgcacggc 
gaacgcgtca ccggtgatcg cttggtctgc cttgccgcgc 
agtcggctcg ggcaacgctc gctgctcgaa gcagtcgact 
ccgttgacgt ccggcgaagc catcgtcgtc ctcgccaccg 
cagcgtctgc aagaacccgc gcgatggtcg gtgcggttgg 
gccaccgtcg gctggatgcg cagggcacca ccatactgcc 



cacaaaccca ccatatcggc 



tgtccacgcg ccaggcgcgc 



tctgcacgat tgagccactg ctgtggagcg cgctggctgg caatcggcga tgcagcgtcc 

cataaggcati tggccgatgc gctcgcagcc 



agctacgatc cgctgtcctc acaaggcgtg 
gccccgcgca tttgcgccgt gctcgaacga 
cagcagatgc tggcgcgttt cgeiaggctac 
gaagcacgct ggctggatgc gccattctgg 



caccaagaca ctgcggtgat ggaacaggca 
cagcgcatgc gcgagcattt ctatcggcag 
cgccggcgcc gtgccgcgca aataccbgcc 



600 
660 
720 
780 
840 
900 
960 
1020 
1080 



gcgagcagcg gcggtaccgg cgtgcgcttg cacacccgcc tggagagtgc ctccgacgcg 

tcgcacggca cggcactaac cacgctgcat 1140 



tggcacaggc gctgggcgcc 1200 

tgttgccggt ggacaccacg 1260 

atggctggtg gtatgcggca 1320 
atgcggcggt cctgcgcgaa 



1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 





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ccgccccaca cggaccattc ccctataaaa acccaccgca tgaacgtatc gccttgagcg 1860 

atcatcaata caaccaaata cccacgacat aggttcatgc tgggacgacg acactgcacc 1920 

tgaaccacat gccgatgttc ggctagcagg accgcatcct gaacagcgct gccgcccgcg 1980 

cagcacgcaa gcccgcccgg atgcacgacg tgcagcgcgg caccaggatc gccgacgcgc 2040 

gcatcgcgtg gtcagtgcgc cggcattgcc agttgcacct cggcactacc gaccttgttg 2100 

cgcgggtcgc tggcggcacc tagctcatcg cggcgtttgt cccattccac cgtctgcagg 2160 

ttgccccaga catggctgga accgcgcccg tccttggcgc tgtcgccagg caacttgagc 2220 

ggatggtcca tcgccttcag cccctgcacg gtggctgcat cgaaggtacc ggtctcggcc 2280 

tcgatcacgt ccggcaacca ctggtggtgg tagcgcttca gcgcggcgac ctgttgcgga 2340 

tccagaccgt cgtcgtagcc gaggatgcca agcagaacca tggtgatgat acgactaccg 2400 

cccggagtac cgagcacgat cgccttgtcc gcgttctcca tgaaggtcgg cgtcatcgag 2460 

ctgagcgggc gcttgcccgg tttgggtgca ttggccgcat agcccatcac cccgaatacg 2520 

ttgggcgtac ccggacgcaa ggcgaagtcg tccatctcat cgttgagcag cacgccggtg 2580 

cccttgggga tcagtcccga gccgtacaac agattgaccg tctgggtggc gccgacacga 2640 

ttgccttcac ggtcgatgat cgaaaaatgc gtggtctcat cgtcttccag cggggtcggc 2700 

tgacccgaca acaggtcgct gggcgtggcc ttgtccgggt tgatggtcga acgcaggccc 2760 

accgcatagb ccttgctcaa taaaatgcgc tgcggcacaa cggtgaaatc cgggtcgccc 2820 

aggaagaagg tgcggtcacg gtaggcacgg cgcatcgctt ccacggtcag atggatccgg 2880 

tgcaccgggt ccatcgcctt gagatcgtag gcttccagga tctgcagcat gctiggccagc 2940 

gcaatgccgc cggaggatgg tggcggcgcc gtggtgatcg tccagccctt gtagtcgaag 3000 

cggatcggcg tgcgctgctt gaccgtgtag ccggccaact cgtcagcggt ccagcggcca 3060 




ccggcctgct tgaccccagc cagcagcttc ctggcggtga cgccgcgata aaagccgtcg 3120

aagcccttgt cggccagcaa ctgcagagtg acggccagtt ccggctgctt gaacaggtcg 3180 

ccctcggcga tcggccggcc atgacgcaga taaacctcgc gcgtgcccgg ataacgctcc 3240 

atcaccttac gccgggcctg atagccctcg gccatgcgcg catacaccgg gaagccgtcg 3300 

cgggcgatgc ggatcgccgg cgccagcgac tgccgcagcg gcaaccgacc atgccgggtc 3360 

gccaactcca ccagcgccgc aggcagaccg ggaatgccag cggaccatgg gccgttgacc 3420 

gagcggtcgt ggtccagtgc gcccttggcg tcgaggaacg tgtcgggcgt cgccgattcc 3480 

ggcgccactt cgcgcgcgtc cagcatcgtg tcctggcccg tcctggcatc gtgcaggaga 3540

aagaaaccgc cgccgccgag accggagctg atcggttcga ccaccgacag cgticgaggac 3600 

accgccaccg ccgcatcgaa ggcattgccg ccctcgcgca ggatctgcaa gcccgcctcg 3660 

gtggcgaggc ggtggccgct ggcgattgcg tcaccgggcg gatgcgacgt gggcgcagcg 3720 

gcgctcgcac cgagcgaggc ctgggcccac gccgatgaca tcagcaccca ggcaagcaac 3780 

aggacacagc gaacgctacg cctcatgcgc agcccccgct ccgtgtgggt acaactcggg 3840

gtggtcgcgg cgcaaccgcg ccagcttggc cagcaactgc ggatgggttt ccggaatcgc 3900 

cggatccgga tcgatgcact ccacggggca caccaccacg cactacggct catcgaaatg 3960 

accgacgcat tcggtgcagc gggccgggtc gatcacgtag atcgtctcgc ccatgaagat 4020 

ggcctggttt gggcaggccg gttcgcaaac gtcgcagttg acgcagagcg cgttgatctt 4080 

gagggacata gtgcgccatc ggaccctgac cagcgcatct taccggatcg tgatgacacg 4140

accatgcatt gacctgtaca acggcgccac cgcctacgcc tcgccactgc gggcgacgcg 4200 

tgcgtgagga cgccggcgcg cgcaacgcgc gccggcgatg cacaacctac ttggcttcga 4260 

cgaagatgta atcggcaccg gtcggcttga ccaccgcctc gacacgggca ttgtcggccg 4320 



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gggcgccgat atagaccacg cgcacgccct tcatgctgtt cggctcgacc ttggcgaagg 4380 

ccgctgcgat caggtcggcc atcttggacg atgccgagga gccgaacgcc agcatgtccc 4440 

ccggctgaat accacgaccg acggcggtgg tggcgctttc gacctgacgc tcgtacttgg 4500

cctggaactc cgcatccgac tccggcggca agtagtacag gaacgggctg ttggtgacat 4560

tgcccatgtt ctggattgcc acttgctgca ggtatttttt ccaaccggca tcgtcgtcct 4620 

tggccggggc agtcagcgcc ggttgcgcat cggcgaccgg tttggccgcc tcttcctttt 4680 

tgcaggcgct cacgcccagg gccaacgaga caatcaacaa cgcgcgtgcg gtggtcttca 4740 

tggaacggtt tcctctgtgt agtcgatgaa tgacgggcgg ctcagccccg cgtcgcggcc 4800

tggcgggcct gaaccagcgc ctgcaacacc gaaggcggaa cgaagccgga cacgtccccg 4860 

ccgaggcgcg caatctcgcg caccagcgag gacgagatga aactgtgttg ctcagccggg 4920 

gtgaggaaca gcgtctccac ctcggggatc agatgccggt tcatgctcgc catctggaac 4980 
tcgtactcga aatcggacac cgcgcgcagg ccgcgcagca gaaccccacc accgaccgaa 5040
cgcacgaaat gcgccaacag cgtgtcgaag ccgatcacct ccacgttgcg gtgtccagcc 5100
agcgcctcgc gggccagggc cacgcgcaat tccagcgaca gggtgggccc cttggacgga 5160

ctctgcgcca cgccgaccac cacctgctcg aacagcggtg cggcccgatt gaccaaatcg 5220

atatgcccat tggtgatcgg atcgaaggtg ccgggataga cggcgatgcg gctattggcc 5280

acggtcatgc gtaggatacc gcgtgaaagt cgccgggcag tttagcagcg gtgcgtcggt 5340

acagggcagc acggacctcg cggctaccgc cctcgcggta caacgcccag ccgaccggca 5400

gcgctggcgc ctgcccggcg ggcgattcca agtacaacca ggcatcgacc gccaggcgtg 5460

ccggcaaacg ctgcaacgcc ttctcccaca gaccggccgt gaaaggcggg tcgacgaagg 5520

cgatatcggc cagcgcggcg ccatcgtgtt cggccagcca gcgcagcgca tcgccctgca 5580

ccacctccac ctgagtctgg gcctgcaatc tggcgacggt ggcgcgcaac tgtgccgcct 5640

gggccgggtc gcgttcgatc agacaagcgc tgtgcgcgcc gcgcgacacc gcctccaacc 5700 

ccagcgcacc gctgcctgcg aacaggtcca acacgcgcgc accgggcagt atcggctgca 5760 

accaattgaa cagtgtctcg cggacccggt cggacgtcgg gcgcagcccg gccaggtcgg 5820 

gcaccggcaa gcgcgtattg cgccaacgcc cgccgatgat gcgtacctgc cccgcaccgg 5880 

gacggttcat cggcagcctc gcgcgggtgt gacggcaaca gaaaccaggc gcagggtcgg 5940 

catcgacggc aagcgttgcg gaacggaacc cggatgatag accagccccc tctgcggcgt 6000 

atgcatcggg cacggcttga agccgacctc tgccgccacc atgtccgctg atagagccgc 6060 

gcgcccccgc gcacggccat ccgtcctcca tggagcacct accgatgagc gtggaaaccc 6120 

aaaaagaaac cctigggctitt cagaccgagg tcaaacagct gctgcagctg atgatccatt 6180 

cgttgtattc caacaaggag atcttcctgc gcgagctgat ctccaatgcc tccgacgcgg 6240 

ccgacaaact gcgcttcgag gcactggtca agccggaact tctggacggc gatgcgcaac 6300 

tgcgcatccg catcggcttc gacaaggacg ccggcaccgt caccatcgac gacaacggca 6360 

tcggcatgag ccgcgaggag atcgtcgcgc acctgggcac catcgccaaa tccggcacct 6420 

ccgatttcct caagcatctg tccggcgatc agaagaagga ttcgcacctg atcggccagt 6480 

tcggtgtcgg cttictacagt gccttcatcg tcgccgatca agtggacgtg tacagccgtc 6540 

gcgccgggct gccggccagc gacggcgtac actggtcctc gcgtggcgaa ggcgagttcg 6600 

aggtcgccac catcgacaag cccgagcgcg gcacccgcat cgtgctgcac ttgaaggagg 6660 

aagagaaagg cttcgccgac ggttggaagt tgcgcagcat cgtgcgcaag tactccgacc 6720 

acatcgcctt gccgatcgag ctaatcaagg aacactacgg cgaggacaag gacaagccgg 6780 

aaacccccga gtgggagacc gtcaatcgcg ccagcgcgct gtggacacgg ccgcgcaccg 6840 
agatcaagga cgaggaatac caagaactgt acaagcacat tgcccacgac cacgaaaacc 6900



cggtggcgtg gagccataac aaggtcgaag gcaaactgga atacacctcg ctgctgtacc 6960 

tgcccggccg cgccccgttc gacctgtacc agcgcgatgc ctcgcgcggg ctcaagctgt 7020 

acgtgcagcg cgtcttcatc atggaccagg ccgaccaatt cctgccgctg tacctgcgck 7080 

tcatcaaggg catcgtcgat tccagcgacc tgccgctgaa cgtctcgcgc gaaatcctgc 7140 

aatctggtcc ggtgatcgac tcgatgaagt cggcgctgac caagcgcgca ctggacatgc 7200 

tggaaaagct cgccaaagac gatcccgaac gctacaaggg cgtgtggaag aacttcggcc 7260 

aggtgctgaa ggaaggtccg gcccaggact tcggcaaccg cgaaaagatc gccggcctgc 7320 

tgcgcttcgc gtccacccac agcggcgacg acgcccagaa cgtgtcgctg gccgactacg 7380 

tggcgcggat gaaagacggc caggacaagc tgtactacct gaccggggaa agctacgcgc 7440 

aaatcaagga cagcccgcac ctggaggtgt tccgcaagaa gggcatcgag gtgctcctgc 7500 

tcaccgaccg catcgacgag tggctgatga gctatctcac cgagttcgac agcaaatcct 7560 

tcgtcgatgt ggcgcgcggc gacctggacc tgggcaagct ggacagcgaa gaagaaaagc 7620 

aggcgcagga agaagccgcc aaggccaagc aagggctggc cgagcgcatc cagcaggtac 7680 

ticaaggacga ggtcgccgag gtgcgggtct cgcaccggct gaccgattcg ccggcgattc 7740 

ttgccatcgg ccagggcgac atgggtctgc aaatgcggca gatcctggaa gccagcgggc 7800 

agaagctgcc ggagagcaag ccggtgttcg agttcaaccc cgcgcatccg ctgatcgaga 7860 

aactggatgc ggaacccgat gtcgatcgtt tcggtgatct ggcgcgggtg ctgttcgatc 7920 

aggccgcgct ggccgccggc gacagcctca aggacccggc cgcctacgtg cgtcggctca 7980 

acaagctgtt gctggagctg tcggcgtaag cagtgacgca cctgcgcgtc gcaccgcgcg 8040 

acgcagtcca cagcgcaccg acattgcaaa gaaaagcgga aacgaaaaaa gggcctacgg 8100 
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gccctttttt cttccat:cgt cgacatcggc ttgcagcgca cgagcaacgc tgtaatgcgc 8160 

cggtgcatca cgctcccgac gcgaccagca gcactcacgc ctgcattaac ttaaaacctc 8220 

accagcttag aacttcaccg tcgcccgcgc ccagtaataa cgcccgagta gatcgtaggt 8280 

cgccacatcg gtgttggcat tgacgacgtt gttggcgtag tacagtggcg gctgtcggtc 8340 

ggtcaggttg tccacgccca cttcgaagcg ggtgtgccac ggcttgacct gatagccgac 8400 

ctgcacgctg tgatagacat aggtaccgat gtcgcgcacc acattggcct gaccgatatc 8460 

ggccgacaac ccctgacgcg cgtcggcact gccaatctcg gtgcggccga cataacgcac 8520 

gcgccaagac gcactccagt cgcccagatt ccagctcaac gtgccgaggc cacgccagcg 8580 

cggaaaattg ccgtaagcgt aggtgtactt gccggcattg tggatggtga cggtgtcggg 8640 

atcggcggta ttcggattga tatcgtagcg gatcacatag gtgccgtcga ggctggcatt 8700 

gaagctaccc caggcggtct gcggcaagcg atagttcagg ctgaaatcgg caccgctggc 8760 

ccacaacttg cccagattca cggtcggctc ggcgatatag ttgatcgtgc cattgtcgtt 8820 

gcgatggatc agcggacaga acggactcgc gtcgttggcg tagcactggt tcagcacggt 8880 

ctgcgccgac acctgggtga tggtgtcctt caagtcgatc ttccacaggt ccgtgctcag 8940 

cgacaggccg tccacccaac ccgggtcgta aaccatgccg aaatcgtagg acttgccggt 9000 

ttccggtttg agccggtagc cggccaccac cgcgcctgaa gccttggcgg acacctgatt 9060 

gttctcctgc tggtagctgc catcggtggg cacgtgggca caggcagccg cgtgtccgcc 9120 

gctgtagccg tcgcacggat cgttgaccgc cggggcatca ccgaccacgc cggagtaaag 9180 

ctcgctgatg ttgggcgcac ggaacacctg cgagaccgtg ccgcgcagca acaggttctc 9240 

gaccggacgg tactccaacg ccagtttgct gttggtcttg ctgccgaccg tgtcgtaatc 9300 

ggaaaagcgg ctgccgacgg tcaggtttag cgaatgcacg ccaggcagcc ccgccagcaa 9360 
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cgggaacaac gcctcggcat aggcttcctt gacgctgaaa ctgccaccga gcacgctggc 9420 

gcagtattcg agcaccccgc agatgccgtt ctcgtcgccc gtccacagcg gatcggcggc 9480

ggtcgaggtg cgttccttgc ggtaagccac accggccgcc agactcactg caccggccgg 9540 

cagatcgaac aggttgccgt tgacgttggc ttcgaactgt ttgacggtgt acacgttggt 9600 

caccatcgga ttgacttgca gcgcccgcaa cgcggcctigg ttcccLgggt tgttgaggtt 9660 

gaacacgtcg atc 9673

<210> 4 

<211> 267 

<212> DNA 

<213> Xanthomonas albilineans 



<400> 4 



ctagccaccg aggcgaccaa caatgcgcaa gagcaagttc accgagagcc agatcgtcgc 60

cacgctgaag caggtggagg gcggtcgcca ggtcaaggat gtatgccgtg agctgggcat 120

ttccgaggcg acgtacttgc tcttccactg gtaataggtc gccgtgctga tgccgacttg 180

gcgacagatg tctttgactg gaacgcctgc gtcggcctgc ttgagcgtgg cgatgatctg 240

tgtctcgttg aacttcgatg tgcgcat 267



<210> 5



<211> 1755 



<212> DNA 

<213> Xanthomonas albilineans 



<400> 5 

ctacttttcg atcagtacgt catcgatgaa catcgcgtcc agaccgctgg tgaagaagat 60

cgacagcgca tcctccactg aatgggcgat cggtttgccc atgacgttga agctggtgtt 120

aagcaccagg ggaatacccg tcaggcggta gaattctttg atcagggcgt gatagcgcgg 180
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gttccaatgt tgcttcaccg tctgcagacg tccggtgccg tcgtggtgca cgacgcccgg 240 

caccttgcgc gtggcttccg cacggaactt cagggtgcgc tccatgtagg gcgattcctg 300 

gtacagctcg aaatactccg cgccgtgctc atgcaaaatc gacggtgcga acgggcggaa 360 

ctcctcgcgg aacttcaccc gcgcattgat gatgtccttg atcgcaggcg aacgcggatc 420 

tgcaaggatc gagcgattgc ctagggcccg tggcccgaat tccgcgcggc cttgcaccca 480 

ggcgacgatc ttaccctcgg tcagcagccg ggccgcgcgt tgtgctgcgt cgtcgaggca 540 

acgagtgaat ttggatagcg cgccgaagcg ctccacgtta tgcaaggtct ccgcactcat 600 

gctgctgccc aggtagggcg attgttcgcg cgcagccggc ggtgtctgct caggatggtc 660 

ctcggcgtgt gcccataatg cggcgcccac cgcgttaccg tcatcgccag gggcggcgaa 720 

tacgtgcaga tgacggaacg gagtttcagc cagcacgcgg ccgttagccg aggaattgag 780 

tgcacagccg ccgcccagca ccaagtggtc ggacaagccc aaagcgtgca ggttgtgcag 840 

gaattcgaag aggacgtcgc agaacacctg ctggccggca taggccaggt tggccaattc 900 

gatcgttggc tggcccttgc atcggcgcat tgcatacagc gtgcgctgca actggctgaa 960 

ttgtgctgcg ggcgcaaacc tcagcgttag gccgtcgacg cgtagcatct ggcgcaacaa 1020 

ctcgtacagt tgccgatcat gttgcccgta ggcggccagg cccatcacct tccattcttc 1080 

gccggacagg gtgccgaagc cgcaaacctc gcagatcata ccgtagaaga agcccaggct 1140 

ggcccaactg ctggtctcgc tttggtggat cggcgtaagc ttgccctgtt ggtagtggta 1200 

gcaggccaaa gcattttttt cacccatgcc gtccagtact gcgcacaccg cctcctcgaa 1260 

cgggctggtg tagcagccgg ccaccgcgtg ggttaggtgg tgctcgtaat gacggtagct 1320 

gggtggcttg aaggcaggct cggccatgtg gctcaagbca tattcgagca ggtgtccggg 1380 

gtgctccacc atcgccagct gcgaacggta gaagaagctc tgtgccacga attgcttgtt 1440 




gacgtgccaa ggcaggtcgc cgaaggcgct gcggtattgg tctaccgctt gcgcggtctt 1500 

gcccaggccc tcccgcatca gttcaggtgt ttgcccgctc caactagtag cgacgaccag 1560 

ttcggcgccg ggatcgccgt attcgtggac cagcttgatg gcgcgctgaa acacgtccgg 1620 

ggcaacgccg attgaacgct tgtactgcag gtagcgctcg gtggcctcgg caaagcgcac 1680 

ctgaccatcg tcgccgacga tagcgatggc tgaatcgtgg aaggaattgg cgagtccgat 1740 

gtaagtgcgc ttcat 1755 



<210> 6 
<211> 1491

<212> DNA 

<213> Xanthomonas albilineans 
<400> 6 

ctacggcgat gattgtggcg caaattgtgt cagtttgaac tgcaatccca gcgtagagag 60

cgccacgaaa gcattggccg cgatgtagta caccaccgta gccccgaagg cctgcttgaa 120
agccagagct acgcctgccg gcccctgcag atgctggtgc aggccgctaa agaaaatttc 180

cgagaccagg gcgatgccca gcataccgcc gacctgctgg atgacctgca gcgcgccgga 240
acctgcgccg gcatccttca gaggtaccgt acgcatcact gtctggaata gcgaggcgat 300
ggtgatgcca cagcccagtc cgccgatcag caacggcagg gtaagcgtcc agggatccag 360

cgagccttca ctgcgcgtga tgatgaccca caaggccaga tagctagcga tcatcagaca 420
ggcgccgctg aagattttcg cgcgtaggct ttcgacgtgc cgagcgagca tagaggcaat 480

cgccacgccg acagggaaag gagtagtggc gacgccggtt tccagtgccg aatacgccag 540
tccttgctgc agaaagatca cgaacaccag gaaaaaaccc tgcagcgccg aatagaacac 600
cgacacggac aaggcgccca agatgtagtc gcgatggctc atcaggtaga tcggcagcag 660

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ggccgggcgc gccaagtggg cttgccgacg ttgccaggcg acgaaggcca ccagcagcgg 720 

aataccgagc gcaatggctg caaagcacca cagcggccag ccgtatgcgc gtccttctat 780

tagtgggaac accaggcaca acaaggcgag cgcggccagg gcgatgccga cccagtcgtt 840 

atggatgccc gcatgcgccg gcaccttggg cacccagatg gcggccgcca gcaaggtcac 900 

gaggccgatc ggcacgttga tcaggaagat cgcgcgccag ccgacgccga acgcatcgat 960 

gtggatcagc aagccgctga cgagggggcc ggcgaatgag gccaggcccg cgaccaggcc 1020 

gaacaacgag aaggcggccg cgcgctcctt cggagcgaac atggtttgcg cgatggccat 1080 

cacctgtggt gccagcatgg ctgcggccaa gccctgcaaa gcgcgcgcga tgatgagcac 1140

gtggatattg ccagcgatgg cgcagaacgc ggacatcaag ataaaaccgg ccacgcccgt 1200 

gccgaacatg cgcttgcggc cgagcatgtc acccaaccgc cccaacggca gcaaccccaa 1260 

cgcaaacagc agaatgtata tcgctacgat ccattccagc tgttgctcgt ccgcgcccag 1320 

gttcttctgg atactgggca gggcgacatt gacgatgcct acgtccagca agttcatgaa 1380 

attggcgctc agcaacacga tcatcgctgg ccagcgccat cggtagtcga actgcgctcc 1440 

gggtggtgcc atgccgggcg gcatacccag cgcttccttg ggtttttgca t 1491



<210> 7

<211> 954
<212> DNA

<213> Xanthomonas albilineans
<400> 7

ttatccgctt atggccgctt cagccggtcg catggtgacg gtgagaaaat gcaagatgtc 60

gcgctccaac tcgcgctgga aggcgctgcg gtcgaagcca ggcggatcga tggccgcctc 120

gcccacgcga gctttcatcg cctcggggaa tacgctgatg aaggcgtagt ggccggcatt 180

gggcaccacg cgcgcttcca gtcgaccatc attgcctagc gccgcgcgtg tcgccacaat 240

cgtttcgtgc gcccattgat ccttttcacc gacgatgagc agcaccggta cctcgacttt 300 

cgccagggca tcctcgtgca tgtacaggct gaaatccggc gcaagcgcca ccacggcgcg 360 

cacgcgcgga tcagctgtga ccggcacggc cctgatcggt acccgatttt gtcgcaccag 420 

cgcggtccag gcgggttgtt cggcgtgttc cgggcgatgc gcaaaatcga ccatgaaacc 480 

ggtatgcggt tcgcccccgg cgatcgctaa ggcggtgtag ccgccgacgg agtggccgat 540 

caccgctacg ttatgggcct gaatggcagg accgaactgc gcatggccgg tgagcgtatc 600 

gatcaccgcg cggatgtgcc gggggcggtc ttccagattc tgatagctgt attccagctg 660 

atgctggaac aggttgtcgc ccggatgctc cggcaaggcg acgataaagc cgtgccgtgc 720 

taggtaatgt gccagcgtgc gaaacactag gccggcgctg cgcgtgccgt gcgagatcac 780 

cgctagcgga aacgggccgg cttcgatcgg cgcgcccagg gccacgtcca gcgtataagg 840 

tcccatcgcc gtatcccgtg aaggcgtggc ggtgggatac atcacccaca tcggcaccac 900 

cctgctggca tcaccatcgg tttccagttt ttggcaaccc acatagctat tcat 954



<210> 8

<211> 1356

<212> DNA

<213> Xanthomonas albilineans

<400> 8

ttacgccggc atcagcttat ccagacttgc accggcgggc gccagccaaa cagcggtacg 60

accttcgccc agctccctgc ccatcaagcc tcgcacgccg gccagatcct gcggcgttgg 120

cagcaacgcg gccagccgtt ggccgtcggc atcgtgcacg gggttgccat ctaagtcgaa 180

ggcaagcccc ttgcacgggc cgaaatcgcg gtggaaacgc tcgtgcggca gctgtagctc 240 






aaaagccagg ccaaggcgcc tgagctgttg attccagcgg ccgatgatag cgccgacttc 300

ggctatgtat tgtcggcgca ttaccgcatt gattgccaac tcggcaggtg cggtggacga 360

gaccagtctg tcttcgcagc gcacgtcgac ggcgacctcc gtaccttcga gcttgtcgaa 420

gttgcgtggg ctgcggatgc cggcctggta caggacgcgt gaacgctccg aaacgtcgtg 480

gccgaacaga tcaaagatct tcggtagcca atagttaagg tatttctgaa cgactggcaa 540

ggggatcgca ccggcgtcga agatcgcgtg cgtatcctca cgcaaggtaa tctcggcgct 600

tcgatacagc acgcgctcca ggccatccac gccgaactta atatgcagag gttcctcgaa 660

catcatgaag cgggcggtgc gtgccaacgg caggaaagct gactgggtca ccgcttgaat 720

ctggtacttg cctacccggt cggcgaagaa gcaccacatg aaatgcgata gccagtcttc 780

ggtgtggtag ttgaaggcat cgagcaggcg cgggttctgc gcatcgccac tcatgcgttg 840

cagcaggccc tcggcggcgt cggctccgtc gctgccaaag tattcgatca gcagatgcga 900

catcgcccag gtgtgccggc cttcctctag aaaaaactgg aataagtgct ccaggtcgat 960

cgcgctgggc accatttgcg tcagctcgtg gctctgctca accgcggcgt tttctacgtc 1020

accttgcacg gtgacatgat ccagcagcag gtcacgatat tcttccggca cgcacgacca 1080

tgccacctgt cccttgcgct cgccgaacac tacggtgttg cggtctggcg gcatcatgaa 1140

tacgccccag cggtagtcgc tggggcgcat gcggtgatag cgcgtccact cgctgccctt 1200

gacgccaccg gtaggcatgc gcaggttcat ctcgcgatcg tggaattcgc tggggccgcg 1260

gaggcgccac cattgcagga agcgaacttg gtaggaagtc agttgcctga ccagcgcgct 1320

gtgtggatca aggtcgatat tgttaggaat gtacat 1356



<210> 9 
<211> 948 
<212> DNA

<213> Xanthomonas albilineans
<400> 9 



tcaggccgcg ctgtgcatcg aggtaggggc gacttcgccg accgtgagcc cgatgaagtc 60 

gatgatccgc tgagccacgg cacccggatc attgaacagt tgcaggtgac cgccacgatc 120 

tagcaggtcc agacgcaaag acgggtcgtg cctagccaac tgcaccgagg cggagtagtg 180 


gctgaagctg tcgtccttgc agtgcacgat cagggtaggg tgcctgccca gggcggtggg 240 

cagcaaggct tgcacggatt gcttgttctc ttcgtaggca cgcatgtagc gcgagaacac 300 

caatgtgctc gctggatcgg ctaggtgcag catggtgagc ttttcggcca agtcgtcgcc 360

gcgtaaaggt tggccgcggt acttgtccag gatggcggcg agctttctgg cctgttccag 420 

accgtgccgc tcgatctgaa ggtagatcgg caaggcgcaa cgttcgaatt cggattttac 480 

gatgggcggc agcaggcccg ccggcgccac ccaagccatg ctgcgtggtg cgaagccatg 540

cagcgcaatg gcatgcacgg ccaattgtgc agcctgacac caaccgacaa aatggcaatc 600 

ggcgtagtcg tgttggtgca ggatgcccag cagggtcgcg gcttggcgat ccagatcgaa 660

gtcttccgcg gttaccgatg tctgggcatt cgggcagccg atggattccc agcacaacac 720 

atggaaatgc ctagccagtc gttgcgccaa ccggctcagc agcaggtagg acatgccata 780 


gggcggtagc agcaccagct tgggcgatgc ctgagcgcct agccaataca gctcaagctg 840 

ccgtccatcg gtagtgcagt attgcgacag ccttacccct gcgagcgcgt cgtccagggc 900 

ggacagatct tgcttttcca gatagtgcgg caagcaagca cagcccat 948 



<210> 10

<211> 252

<212> DNA

<213> Xanthomonas albilineans



<400> 10

tcacgagtaa gtcgattcga acccacttcc gtctggagag ctcgcgtccc ctaaattctt 60

gtcatcgagt tcgcgcagcg ataaggggcg catgtcggtc caggtttcgt cgatatacgc 120

catgcactcg tccttaccgc cagcatagcc ttccttgcgc cagccaggcg ggacctcaag 180

atcggacggc cacagtgaat actgcagttc gtcgttgatc agcaccagat aagcctgttc 240

ctcgaacgtc at 252

<210> 11
<211> 2151
<212> DNA

<213> Xanthomonas albilineans
<400> 11

ttgcgctgcc ttatcataaa taattacgat tcgttcactt ggaatctcgc cgactacgta 60

gcgcagatct tcggcgaaga tcccctggtg gtgcacaacg acgagtactc ctggcacgaa 120

ctgaaggacc gcgggggatt ttcctcgatc atcgtttcgc ccggtcccgg ctcggtggtt 180

aatgaagcgg attttcacat ctcgctgcag gcgctggagc agaacgaatt tccggtgtta 240

ggcgtatgcc tgggctttca gggacttgcg catgtctatg gtggccgcat cctgcatgcg 300

ccggtgcgct ttcatggccg tcgctccacc gtcatcaaca ccggcgacgg tttgttcgaa 360

ggcatcccgc agcgtttcga ggcagtgcgc tatcactcgt tgatggtctg ccagcaatcg 420

ctgccgcctg tgctgaaagt gacggcgcgt accgattgcg gtgtggtgat gggcttgcag 480

cacgtgcaac acccgaaatg gggagtacag ttccaccccg aatcgatcct caccgaacac 540

ggcaagcgca ttgttgctaa ctttgccaag ctggctgcgc gccacagtgc accgttactt 600

gccgggtcgg agcaggccgg caaggtttta agcgtttgcg cgcccgagat ggtgacaccg 660

cgggtacgtc gcatgctgag ccggaagatc aagtgccgtt ggcaggcgga agatgtcttt 720






ctggccttgt tcgctgacga aaagcattgc ttctggctgg acagccagct ggtctgcagt 780 

ccaatggcgc gctattcgtt catgggagcg gtgaacgaga gcgaggtagt gcggcattgc 840

gtgcggccag ggagcatggt gcaggaggca ggcgagcggt ttcttgctga gatggatcgg 900 

gcgttgcaat cggtgcttac tgaggacgtc gccgagcggc caccgttcgc gtttcgcggc 960 

ggctacgtgg gctacatgag ctacgaaatg aaatcggtgt tcggcgcgcc ggcttcacat 1020 

gccaatgcca tccccgatgc gctgtggatg cgcgtggagc gcttcgttgc cttcgaccac 1080 

gccactgagg aggtatggtt gctggcgctc gccgatacgg aggatctgtc ggcattggct 1140 

tggctagacg ccatcgagca acgtatccat gccattggtc aagcggctcc ggcttgcatt 1200

tcgctaggcc tgcgcagcat ggaaatcgag ctcaatcatg gtcgtcgcgg ctaccttgag 1260 

gcaatcgagc gttgcaaaca acgcatcgtc gatggcgagt cctatgaaat ctgtcttacc 1320 

gacctgttct cgttccaggc cgagctggat ccattgatgc tctatcgcta catgcggcga 1380 

gggaacccag cgccgttcgg ggcctatttg cgtaacggta gcgattgtat ccttagtact 1440

tcaccagagc gttttctgga agtggacggc cacggcacga ctcagaccaa gccaatcaag 1500

ggcacctgcc gccgtgccga ggatccccaa ctggaccgta acttggccat gcgcctggcc 1560 

gcckcggaaa aggaccgagc ggaaaacttg atgatcgtcg acttgatgcg caacgaccta 1620 

agccgcgtgg cggtgcccgg cagcgtcacc gtgcccaagc tgatggacat cgaaagctac 1680 

aagaccgtgc atcagatggt cagcacggtg gaagcgaggc tgcgcgccga ttgcagtcta 1740 

gtcgacctgc ttaaggcggt gttccccggc ggctcgatca ccggcgcacc gaagttgcgc 1800 

agtatggaga ttattgatgg cctggagaat gcgccgcgtg gcgtgtattg cggcagcatc 1860 

ggctacctgg gctacaactg cgtcgccgac ctaaacattg cgatccgcag tctttcttat 1920 

gacgggcagg aaatacgttt cggcgccggc ggcgccatca ccttcctgtc cgacccgcag 1980 




gatgagttcg acgaagtgtt gctgaaggcg gaggcgatcc tcaagccgat ctggcattat 2040 
ctacatgcgc cgaacactcc cctgcactac gagttgcgag aggacaagct gctgctagcc 2100 
gagcactgcg ttagcgaaat gccggccagg caggccttca tcgaaccatg a 2151

<210> 12 

<211> 414 

<212> DNA

<213> Xanthomonas albilineans

<400> 12 

atgaggcccc cacgcttacg cgcgaaccag gacgggctgc tgatggatac ggccggccgg 60 


gtggtcgagg gctgcaccag caatctgttc ctcgtcgaga acggccatct ggtgacgccc 120 

gacctgggcg tggccggcgt cagcgggatc atgcgaggca gggtgatcga atatggccgg 180 

cagcacggtc tcgcctgcgc ggtaaagcac gtctatccgg accagctagt gcgtgctcag 240

gaggtgtttc tgactaacgc cgtgttcggc attctgctgg tgcgcagcat tgacgctcac 300 

agctaccgca tcgatcctgt taccctgcgt ttgctcgatg ccctgtgtca gggcgtatat 360 


ttcaccgaac ggtcactaca tcaggtttcc acccatgccg gccaagaccc ttga 414 

<210> 13 
<211> 603

<212> DNA

<213> Xanthomonas albilineans
<400> 13

atgccggcca agacccttga aagcaaggat tactgtggag aaagcttcgt cagcgaagat 60

cgctccgggc aatcgctgga gtcgatccga ttcgaggatt gtacgttccg acaatgcaac 120

ttcaccgagg ctgagctcaa tcgctgcaag ttccgcgaat gcgagttcgt cgactgcaac 180

ctgagcctca tcagcattcc gcaaaccagc ttcatggaag tgcgcttcgt cgactgcaag 240



atgctcggtg tcaactggac cagcgcacaa tggccatcgg tgaagatgga gggggcgctg 300

tcgttcgagc gctgcatcct caacgacagc ttgttctacg gcctatacct ggccggggta 360 

aaaatggtgg agtgccgtat ccacgatgcc aacttcaccg aagccgactg cgaggatgcg 420 

gacttcacgc agagcgacct gaagggcagc accttccaca acaccaaact gaccggcgcc 480

agcttcatcg atgcggtgaa c&accacatt gacaCcttcc acaacgatat caagcgggct 540 

aggttcagcc tgccggaagc agcctcgctg ctcaacagcc tggatatcga gctgtccgat 600 

tga 603 



<210> 14 
<211> 609 
<212> DNA 

<213> Xanthomonas albilineans 
<400> 14 

atgcatccgc cgtcgccgtt gaacacgcag cagaaagact ggctgacacg cggtggttcg 60 



ttgaccgcgc acctgcgcct gttggggcag gtacaggtgc aagtgcaacg ggagcacaaa 120 

gccatggcct ggctggatga atatcgggtg ctcggactgt cgcgctgcct gcttgtatgg 180 

gtgcgtgaag tggtcctggt ggtggacgcc aaaccccatg tctatgcgcg tagcctgacg 240 

ccgctgaccg ccagttacaa cgcctggcag gcagtgcgta gcatcggcag tcgcccgtta 300

gctgatctgt tgttccgtga tcgcagcgtg ctacgttcgg cgttggcgag tcggcgcatc 360 

accgcgcagc atccgctgca ccggcgcgca tgcaacttcg tggcacagtc gcatgcgacg 420 


caagccctgc tggcgcgccg ctcggtattt acgcggcaag gcgccccgtt gctgatcacc 480 

gaatgcatgc tgccagcgtt gtgggcaacg ctggaaccgg tggcagctcc gcgccaggcg 540 

agtctgagtg cggacggccc ttgccggcat tcagcgcaga tcgtctcgcc tgagtcgatg 600

ctggaatag 609



<210> 15
<211> 5880
<212> DNA

<213> Xanthomonas albilineans 
<400> 15 

tcaggcatcg aggtccgtca ccagcatgcg cgccagctca cgcagcgggt cgccttgtaa 60 

catgctgtag tgattgcccg ccgcgctctt gatctgcgaa agtggtacgt aacctgtgat 120 

acccggcagt acctcgctac ccccgcgtgg cttggacatg tcggcatagg acacgtgtac 180

cgcagtctgt gcctggtaca gatgagcgtt aggctgcagg cactgcggct cgaaaccggc 240 

caacagcccc aggtgatagc gcgtaacgcg caattgctcg gccagcggtg gccacatgcg 300 

gctttccaga gtaaacctca ggtgtgcgag cagtttgtcc agtgatgcct ggtggcggct 360 

gtagtcgaac acgtgctcgt cgtcaccgtc ggcgaacagc agctgccggg cttcgtctat 420 

ttcagcctgc tcgaagccgc gcttggccaa cgcggccaac gtattgagcg ctgccacgaa 480 

ggtaagctgc cggggctcgc gcgcatgtac cggtatcagg ctgctatcga gcaggcccac 540 

gtaatcgacg cgcaggccgc gccgctgcaa ttgctcagcc acagccaggg ctagcacgcc 600 

gccggaggac cagcccagca agcggtaggg cgcaccggtc gggccagcca gcaacgcatc 660 

gcagtagtgc gcggccaggt cggacaaatg cgcgaagcgg cgcaccggtt cgcattgcag 720 

gccatagacc ctggcagagt gccctagggc ggcagccaga tcgatgtagc aatgaatctg 780 

gccgccgatc gggtggatgg catacaccgc cgcgcgttcg gtgcgtaggc taagcggtac 840 

gataagactt accggcatgc tgccggcttg gctttggcgc atgccgcgct cgacgacagc 900 

ggcaaaatct tccagcacag gggattcgaa cagcgtgttg acccgcactt cgatgtcgaa 960 








actctggcgg atacgcgaga acaactgggt ggcaagcagc gagtgaccgc cgaggttgaa 1020 

gaagttgtcg ttgaggctca cgcgtaaggg ggcggcctgt gcgggggtca gcagttcgct 1080 

ccacagcttg gccagggtga tttcgacctc gctgcgcgga gcgaggtagt cgctgtcgct 1140 

gctggcggct tgcggctcgg gcaggctcaa ggtatcgagc ttgccattgg gcaagcgcgg 1200 

aagcgccggc agcgactgga agcgcgttgg cagcatgtag gtaggcaagc gttcctgcag 1260 

cagcttgcgc agctcgtcga ggttcaggac accctggcgt ggcaccacgt aggccaacag 1320 

ttccggcgtg ggcgagcctt gcggccaacc gatcacggcg gcctcggcga cctgcaggtg 1380 

ggccgccagg gctttctcca cctggcgcac gtccacgcgg tagccgcgga ccttgacctc 1440 

gtagtcgcgc cggccgagca gttccagcgt accgttgtcc agtaggcggg ccatgtcgcc 1500

ggtcctgtac agacgcgagc cgggggggcc gtagggattg gcgatgaagc gcgcggcggt 1560 

caggccgccc tggcgccaat agccgtgcgt aatgccgagg ctttcgatgt gcacttcgcc 1620 

catgatgcca ggcggcaacg gtcgcagttg ttcgtcgagc acatggacct tggtattggc 1680 

gatgggccgt ccgaccggga cgaagccgct gccgctgtgc tgctcggccg gatcgcaata 1740 

ggtcatgtcg ttgatttcgg tacacccgta gatgtaccag gccgtgcagg caggcagcag 1800 

cgtcctgagc cgttgcagca gttccgccgg gcagggttcg atggagacga agagctggcg 1860 

cagtcgcgcc agccgctgcg gtgtctcagc aacgtggtcg agcagcgcgt tgagctggga 1920 

aggaaaggta tacaggcgtg tgatctgcca ggtttccagc gcgcgcacga aagcggggat 1980 

gtcacgcacg gtatcctcgt cgatgaacac ctgcggtacg ccagcaagta ggccggcgag 2040

cagttccttg accgaaatgg caaaggcgat cgaggtcttt tgcgccaccc gctccccggc 2100 

ctcgaaaggc gcacgtgccc acagcgcatg cagccagttg aggatttgcc gatggggcac 2160 

catcaccccc ttgggacgac cggtggaacc ggaggtatac atcacgtagg ccagctgcgc 2220
cggatgcagg gcatgcggca gtggcgtatg cggttgacga gcgatggcgg catcgtccag 2280
gcgcagccgc ggtacttgga tcagttgccc gtcgatgtcc ttgccgcaga gcaacagccg 2340 
tggctgcgcg tcgtcgagga tctgctgaat gtaggtggtg gggtaatgcg ggtccaacgg 2400 



cacgtagcag ccaccggcct tgagcacgcc gagtagggca atgaggaaat cgggcgagcg 2460 

gccgaaccac agggcgacgc gctcctgcgg gcgcaggccg cgctcgatca ggcaatgcgc 2520 

caggcggttg gcgtgttggt ccagttgggc atagctcaac tgtcggtgtt gatcggcaca 2580 

agccagttcc tcggcgtgca gtgccacttg cgcatcgaac agatccagca cactgcgcga 2640 

ggtatccaga gtatgagggg tgaactcggt gcgagcgacc ggcagcgaaa aatccgagag 2700 

gc^cagcgc ggttcttcca gcatccgctc cagcactctt tggtggtggg ccagcatgcg 2760 

ctgaaccgta gccgccgaaa acagctccgc ggcgtattcg acagtgactt ccaggtggct 2820 

tccgtcgccg atgaactgca ggtccagctc gttgggtgtg gtgcgttcgc caaattccat 2880

ctgagcgctg aggaagatct gggcaaatgc attgacgcct tcggtggcga aattttggtg 2940 

tcggagcatg atcggcacga gcgggatctg gctgctgtca cgcggtttct tgagagcgct 3000 

taagacatgc tcgaacggca gtgcgcgatg cgcgtaggcg tccagcactt gctggcgcac 3060 

gtgctgcaga aaatcctcgg caaaggcgtg actgcccaag tttaggcgta ccgccaggat 3120 

actgacgaaa aagccgatca gattctcggt ttccagctga tcgcgtccgg cgctggtagt 3180

acctaagcag agttctcgcc ggccggtgta ctggtgcaag acgatcgcca ggctcgccat 3240 

aagtgtcatg aacaaggtga cgcgccgttc ctggctgaat gcggcgagac gcgcggccaa 3300 

ggcgtcggga taggtcaggt gtagtatgcc agcacgccaa gctcgattag ccgggcgtgg 3360 

aaaatcgtag ggcaaggcca gcccttcttc gtaaccatgc aaacgcugtt tccaataatc 3420

cagatcggcg ctgaaatcct gtacgcgctg ccatgtagca tagtcggcat attgcagtag 3480 





cagcggtggc agtgccggtg gcgtctgctg tagcgcggct atatagaaag cacgtaggtc 3540 

gtgaaagatc aggttaatcg accagccgtc gcagatgatg tgatgcatgt tcatcaggaa 3600 

cacgtggtaa tcgtccgata cgcgcagcac cgataccttg agcagcgggc cgtgggcaag 3660 

atcgaatacg Cgcgcggcgt gctcggcgac taggcgtggc acttctgcgg gtgtcgctgt 3720 

gatgcaaggc actgggacct gcatggcgtc ggcgatgtgc tggctgggat aatcgccgcc 3780 

agcgcaagtt gctatgcggg tgcgcaaggt ttcatgcctg gccaccagcg cctggatcgc 3840

ctcgcgcagc gctgacatcg agaaatcggc actgcgtaaa tggcaggcga aggcgacatt 3900 

gtaactggta cgttgctcgg gcatgtgttc atgcacgaac cacaggcgct cctgctgata 3960 

gctcagcgga acgggagcat cgcgcacggc acgggaagag atggtgttgc cgccagtcgg 4020 

agcctgttgt tgccgcgctt cgttgaccac tcgcgcaaaa tcttccagca caggggattc 4080

gaacagcgtg ttgacccgca cttcgatgtc gaaactctgg cggatacgcg agaacaactg 4140 

ggtggcaagc agcgagtgac cgccgaggtt gaagaagttg tcgttgaggc tcacgcgtaa 4200 

gggggcggcc tgtgcggggg tcagcagttc gctccacagc ttggccaggg tgatttcgac 4260 

ctcgctgcgc ggagcgaggt agtcgctgtc gctgctggcg gcttgcggct cgggcaggct 4320 

caaggtatcg agcttgccat tgggcaagcg cggaagcgcc ggcagcgact ggaagcgcgt 4380 

tggcagcatg taggtaggca agcgttcctg cagcagcttg cgcagctcgt cgaggttcag 4440 

gacaccctgg cgtggcacca cgtaggccaa cagttccggc gtgggcgagc cttgcggcca 4500 

accgatcacg gcggcctcgg cgacctgcag gtgggccgcc agggctttct ccacctggcg 4560 

cacgtccacg cggtagccgc ggaccttgac ctcgtagtcg cgccggccga gcagttccag 4620 

cgtaccgttg tccagtaggc gggccatgtc gccggtcctg tacagacgcg agccgggggg 4680 

gccgtaggga ttggcgatga agcgcgcggc ggtcaggccg ccctggcgcc aatagccgtg 4740
cgtaatgccg aggctttcga tgtgcacttc gcccatgatg ccaggcggca acggtcgcag 4800

ttgttcgtcg agcacatgga ccttggtatt ggcgatgggc cgtccgaccg ggacgaagcc 4860 

gctgccgctg tgctgctcgg ccggatcgca ataggtcatg tcgttgattt cggtacaccc 4920 

gtagatgtac caggccgtgc aggcaggcag cagcgtcctg agccgttgca gcagttccgc 4980 

cgggcagggt tcgatggaga cgaagagctg gcgcagtcgc gccagccgct gcggtgtctc 5040 

agcaacgtgg tcgagcagcg cgttgagctg ggaaggaaag gtatacaggc gtgtgatctg 5100 

ccaggtttcc agcgcgcgca cgaaagcggg gatgtcacgc acggtatcct cgtcgatgaa 5160 

cacctgcggt acgccagcaa gtaggccggc gagcagttcc ttgaccgaga tggcaaaggc 5220 

gatcgaggtc ttttgcgcca cccgctcccc ggcctcgaaa ggcgcacgtg cccacagcgc 5280 

atgcagccag ttgaggattt gccgatgggg caccatcacc cccttgggac gaccggtgga 5340 

accggaggta tacatcacgt aggccagctg cgccggatgc agggcatgcg gcagtggcgt 5400 

atgcggttga cgagcgatgg cggcatcgtc caggcgcagc cgcggtactt ggatcagttg 5460 

cccgtcgatg tccttgccgc agagcaacag ccgtggctgc gcgtcgtcga ggatctgctg 5520 

aatgtaggtg gtggggtaat gcgggtccaa cggcacgtag cagccaccgg ccttgagcac 5580 

gccgagtagg gcaatgagga aatcgggcga gcggccgaac cacagggcga cgcgctcctg 5640 

cgggcgcagg ccgcgctcga tcaggcaatg cgccaggcgg ttggcgtgtt ggtccagttg 5700 

ggcatagctc aactgtcggt gttgatcggc acaagccagt tcctcggcgt gcagtgccac 5760 

ttgcgcatcg aacagatcca gcacactgcg tgaccaatcc aaggcgagcg cggtatccgg 5820 

actggccgcg gtcagcgcga cgtcttcggc atccagtagc gacatgcttg atagtttcat 5880 



<210> 16 
<211> 993 




<212> DNA 

<213> Xanthomonas albilineans 

<400> 16

ttaaacgtgc agcttgagca tggcttcgcc atcgctgcgc gcagcgatgt caggggacga 60 

ttgttcgggc gaataaggtt cggccatgca cacgaggatc ttgcggctgc cctcgtaggg 120 

ctcgcgtccg tgcgaaacca gcatattgtc gatcagcagc acgtcgtccc gatgccagtc 180

aaaatggatc ttgtgctggg cgaaaactgt gcgcacatgg tcgagcatgg cggggtcgat 240

cggcgtgcca tcgccgaaat aggcgttgcg cggcagtccc tgctcgccga agaacgacag 300

catcatcttc tgcgcagctg cctccagcgc agtgtaatga aacaggtgtg cttggttgaa 360

ccaaacttca tcgccggtcg ccggatggca cgcaaaggcc cggcagatct ggctggtgcg 420 

caggccgtcg ccggtccatt cgcatiigcat gtcgttgcgg gcgcaataag cttctacttc 480 

ctgcttgttg cgggtgttaa acacgtcctc ccatggcagg tcgacccctg cacggcagtt 540

cctgacgtag cgcacctgtt tgcgcgcaaa gatttcgcgc acttgcggat cgatagcggc 600

tgtgaccttg agcatgtcag ccaacggcgt gcagccgccc tcgctggccg gctgcacgca 660

atggaacagc agtttcatcg gccagacgcg ctggtaggcg ttctcgcaat gttgcgctat 720 

ggacagctgc ctgggatatt cggtggccgt gtagacatgc tggccgacgt cggtgcgcgg 780 

tgtggaacga taggtatagg ccagtcgctc atcgaagaaa cagcgtgaga tctgctccaa 840

gccaccaggg tgcgcaaagc cacggaacag taacgccctg tgttgccata gcagggtagg 900

ccacgtcgcg cggtgcgtgg cattccaatc agtcagcgtg gcctcggccg agtcggcctt 960

gatggtcagg ggaagatcag cgtttgtgtg cat 993 

<210> 17

<211> 2298
<212> DNA

<213> Xanthomonas albilineans
<400> 17 

ctaggttcga gcatggccga cgaagaacag gtgggattgc aggtgcttgg cgctagccag 60 

tgtctccaac agcgcaggcc ggattacctt gccgctgcag gtacgtggga tcgtgctcac 120 

ttcgacgaat agatggggat agtgatgctt gcctagcgcg ttcttgcaca aggcgcgtaa 180

ggcggcccat agcgcccctg tatcgatgct ggcatcgaca ggcaccacga aggcggcggg 240 

gcggggcaag ccgaactcgt cttcgatcag gcagatcgcg cactccttca cgcaggcgtg 300 

cgtctggatg acgctctcca gcgtctcagg cgaaagccaa caaccgctga tcttgatggc 360

ggagcccatt ctgcccaagt tatggaagcg ccccttggcg tcggcaaaga acaggtcgcg 420

tgtatcgaac cagccgtcga cgaacaattg ggcactgagt atggggtcgc caacatagcc 480

cctcgtcagc gtattgcccc tcacccacag gctgccgact tcgcctatgc ggcagacctc 540

tccctgcttg ttcaccagct tcacaacaaa gcctggtacc ggcgtgccag tgcaacccat 600

gagcgcgtgg cctggccgat tggagatgaa ggtggacagt acctcggtgc agccgatacc 660

gtcgagcact tcgacctgcc aacgcgtgct gatcgcatga ccaagcctcg ccggcaagct 720

ttcgccggcc gatatgcaca ggcgaagcgc cggccacacc gcatccggcg cggcctcggc 780 


aagcaacagc ttgaacacgg cgggcacggc gagcaataca gtgacgtggt aagtgtggat 840 

ggtttgcgcg atctgcctga cgctaagcgg cgcggcaatc acatggctga caccagcgag 900 

cagcgacagc atcaggttgt tcaggccgta ggcgaaaaac aaccgcgacg gtgtatacat 960

cacgtcatcg ctgcgcagcc cgagcacggc ctgctggtag ttgagatggc agtgcataaa 1020

atcggcatgc gagtgcgtta ccgccttggg cgtaccagtg gagccggacg tgcatatcat 1080

caccgcgggt gcatcggcgg agcagggcgc aaccaccagt tcgtcgtttt cgatcaccgg 1140



catcaggctc gtcagctcta aggtcggcag gtggcgcaac gcggcatgat gggaaggcgg 1200

cagttcggca tcgatcagca cgaggcgagg cttgatggtc ttgagcgtgg tctcaaagtg 1260

gacgagcgac acaagctcgt ttatcggcgc aaagaccaag ccaccggcca agcaggccag 1320

catcagggca acgcccgcta ggctgtcgat agcaatcagc gccaccgcat caccgctttg 1380

caggccaagc agactcaagt gccgggcata ggtcgccgcg cgagagcgca actggcgata 1440

gctgaaggcc tgctggcgca acggatcgat catttgcgcc gtcgaagcca gatgcgctgc 1500

ggaaaaaatt tgcgcgcaca cgttgacctg gaccgacggg aggaagccga taggcgcaca 1560

ggcgaacacc gccgcgcttg cgtccgacca ggacggcatc ggcccatcgg tcgagcgtgc 1620

gaaccagctc gcgggcaaat gactggcaat ctgacatagt ttgccgtggt cgcaggccag 1680

cagactggta tccacctcga tcaggtcttc gacgcaggaa agcgcaggca aagagatctg 1740

cgccgcgctg ccgcatacgg cactatcgcg caagtccggc aggttccttt ggcggtggtc 1800

cgcatgccat agcagcaggc catcgctgcg tcgcgtggcc aaggcttcca gtgccatgcc 1860

cagggcgcct tgcaaccggt ccagttgctt gggttcctcg atacggccca aatccagtgc 1920

cagggtgttg gcgccggggg gcgattcgga cagcacgatt tggtgccgat gcttgaggta 1980

atcgcaaatc aggccggcca gcttgcctaa ccgtgcatat tcccgtagca ggctaccgaa 2040

gctgccacag gggtaaggtg cggcatagtc aatggttatg tgctggccga tcggcgtgtc 2100

gctgacatcg atacgcaagc caggataatc gcgccgccat tgatgcagca gcgtggtcca 2160

ttgacgagca taggcctcgt tgcgccctgg ccgcgtctga cctaccgacc aacgcaaatc 2220

tagttcggtg ccagagtgca tcggaagatt tgtcagtggg ctatccataa gcgttctcgg 2280

gtaaggcgat cgacgcat 2298
<210> 18
<211> 861 
<212> DNA 

<213> Xanthomonas albilineans
<400> 18 

tcagcgtttc agcgcgtgga ccagataact ttgcggcacg ccgtgggtgc cgcgcggtgc 60 

gaccggaaac ggccaacgcc ccatttgctg gccagcggcg gcgatggtca acacccctgg 120 

ctcccagccc agtggttgga tcaggctttc cggttcgtca gtgccaaact ggcgagtcat 180

cgcgtgcagc actcgggcat tgggcgagtt gagcatagag aggccgataa catcgaacaa 240 

cacgctgctg cccttggcac tcaatgcatc gatgcgcgcg aacagcagca tcactgcctc 300 

ggcgctcaag tagcacagca agccctcgac cagccacaag gtggcggcgc tgccgacgaa 360 

tccactctcc ttaagtgcct ggggccagtc ttcgcgcaaa tcgatcggta gcgcaatgcg 420 

ctggcaaacg ggctgggcgt catggagttt ttcgtgcttg tcggagagga catccatgtg 480 

gtcgatctcg tagacccggg tatcggacgg ccaggggaga cgataagcgc gtgcatccat 540 

accggcggcc aggatcacca cctggccaat gccttcacta accgcctgca tgatcttgtc 600 

gtcgagccaa cgcgtccgta cctcgatcgc cggaggcatc ggtacgttct ggttgttgcg 660 

tctgagctct tcaacgaatt catcgccggc cagacgccgt gcgaaagggt catggaacag 720 

cgcctgctcc cgctcgcttt ccagcgcccg catgcctgcc acccataaag cggttctttc 780 

gatatctctc atgcatacgc tccggttcgt ggtcggcttg cgccgatgca tcatagatat 840 

gcatgactcg attcgcggca c 861 

<210> 19 

<211> 720 

<212> DNA 

<213> Xanthomonas albilineans 
<400> 19

ttacggatgg tctgatccac gcaagcgaaa gatgagataa accacatcag ctgtcaacgc 60 

cgatttaaat ttgacccact ttcctttgaa tcgtcgaagt aaatctgacc cacccggggt 120

cttccatcgt cgggctgcta ggctgcgcag ggcaaagccc gtcgcagccc agcagccctg 180 

cgccggctca cgcccgaagg gcaggtagcc gatctcgtcg accaccagca gcttcggccc 240 

tagtaccgcg cgattgaagt agtccttcag ccggttctgc gccttgaccg ctgccagttg 300 

catcatcagg tcggccgcgg tgatgaaacg tgccttgtgc cccgccatca ccgcacgctg 360 

gcacagcgcc agggcgatgt gggtcttgcc gacaccgctg gggccaagca tcaccacgtt 420 

ctcggcgcgc tcgacgaagg tcaggtggcc gagctcgacg atctgcgcct tcgaggcgcc 480 

gccggcctgg gcccagtcga actgctccag cgtcttgatg gacggcatcc tggcaagtcg 540 

cgtcagcacc gtgcgcttgc gctcttcacg cgcgagctgt tcgcttgcca gcaccttctc 600 

caggaagtag ctggcatcct cgcacgcggc ggcctgtgcg agtgcttgcc agtccgagct 660 

caggcgtgcc agcttcaact gctcgcacag cgcggcgatg cgcgcacact gcaggtccat 720

<210> 20 
<211> 20640 
<212> DNA 

<213> Xanthomonas albilineans 
<400> 20 

ttgcccaatg cgctcatgca gataactctt gtagccgtcc agtttgcagg cgtattgtta 60 

ggcgtcaccg ctcgcgcggc gatccccaat aaggcgggta tgagacgcgc atggccgccc 120 

ttcccgcagg cgtgctgtcg ctctattgct tacctcatgc agagatcgcc aatgtcgccg 180 

ttacagcaaa cgctgctaac ccgcctcgcc agtgcggccg cctcccggac aatgatcgag 240 

tttccgcgtc cggagcacgc atcgccacaa tgttgcgacg atgccgagct tgcgcgactg 300 
atcgtgcagt tgtcggcggg actgcaaccg ctggcgatgc cgggtaccta cgtgatcatt 360

gccgcgccac atggtggttt gttcgcggca gccctgcttg cctgtttgca tgccaacctg 420

gtggcggtgc cgtttccact ggatgttgct cagccaaatg agcgggaaca ggccaggctg 480

gagacgatcc acgcacaatt gatggagcat ggcaatgtag cggttctgct tgacgatgtc 540

gccgatcgca gtgccttcgc gcgcatggcg catgctgcgg gcaccttcct ggcgaccttc 600

gccgatctaa agcgcgaatc gaccagcgcc tccttgtgcc cggcgtcgcc ttcggacgcc 660

gccttgctgt tgtttacctc tggttcctcg ggtgagtcca agggcatcct gcttagccac 720

cgcaacctgc atcatcagat ccaggctggc atccggcagt ggagcttgga cgagcatagc 780

catgtggtga cctggctttc tcccgcgcac aacttcggcc tgcatttcgg cttgctggca 840

ccctggttca gtggcgcgac ggtcagtttc atccatccgc acagttatat gaaacgaccc 900

ggcttctggc tggagacggt tgcggctaga gacgccacgc acatggccgc gccgaacttc 960

gcgttcgact actgctgcga ctgggtgatg gtcgagcagc ttccgccgtc tgcgttgtct 1020

acgcttacgc atatcgtgtg tggcggcgag ccggtgcgcg cctcgaccat gcagcgcttc 1080

ttcgagaaat tcgccggact cggtgcgcgt acgcagactt tcatgccgca cttcggcttg 1140

tctgaaaccg gtgcgctgag taccttggac gaggcgcccc aacagcgcgt cttggaacta 1200

gatgccgacg ccttgaacaa acgcaagcgc gtggcggcag gggcgagcca ggcgcgtgtg 1260

acagtgctca attgcggcgc cgtcgaccaa gatgtggagt tgcgtatcgt ctgtcctgaa 1320

ggcgagacgt tgtgcagacc agatgagatc ggcgaaatat gggt^aagtc gcctgcgatc 1380

gcccgtggct acctgtttgc gaagcccgcc gatcagcgac agttcaactg cagcatccgt 1440

cataccgacg atagcggtta ctttcgtacc ggcgacctgg gtttcattgc cgatggctgt 1500

ctgtatgtca ccggaagggt aaaggaggtg ctgatcatac gcggtaagaa tcattacccc 1560
gcacatatcg aagcctcgat cgccgctacc gcatcgcctg gcgcgctgat gccggtggtg 1620

ttcagcatcg agcggcagga cgaggagcgc gtagctgcgg tgatcgccgt caatcacccg 1680 

tggacgccgg cagcatgcgc cgcgcaggca cacaagatcc ggcaacaggt agccgaccag 1740

catggagtcg ccctggcgga gctagccttt gccgaacacc ggcacgtgtt cggcacctat 1800 

ccgggcaaac tgaagcggcg cctagtcaag gaagcctatg tcaacggcca gctgccgttg 1860 


ttatggcatg agggtaagaa ccgggacgta ccagcggccg ccgcggacga tcggcaggcg 1920 

caacacgtgg cggacctgtg tcggaaggtc tttttgccgg tgttgggtgt cgcgccgccg 1980 

catgcccaat ggccgctgtg cgaactggcg ctggattcgc tccaatgcgt gcgtcttgcc 2040

ggtgccatcg aagagtgcta cggcgtgcct ttcgaaccca cgttgctatt caagcttgag 2100 

acggtcgggg caatcgccga atatgtcctg gcgcacggac gtcaggcgcc cacgccgacg 2160 


cgtgcgccgg tggcaagcac aacatgctca gaggaaccga tcgccattgt ggcgatgcac 2220 

tgtgaggtgc ccggagcggg cgagaacact gaagcattgt ggtcgttcct gcggagcgac 2280 

gtcaacgcga tccggccgat cgaatcaacg cgcccggact tatgggcagc gatgcgcgcc 2340

tatcccggcc tcgcgggcga acagctgccg cgctatgcgg gtttcctcga cgacgttgat 2400 

gctttcgatg ctgcgttttt cggtatctcg cgtcgcgagg ccgaatgcat ggacccgcag 2460 


cagcgcaaag tgctggagat ggtgtggaag ctgatcgagc aagccggtca cgatccgctg 2520 

tcctggggcg gccagccggt cggcctgttc gtgggtgcgc atacgtccga ctatggcgag 2580 

ctgctggcga gccagccgca actgatggcc caatgtggcg cttacatcga ttcgggttcg 2640

catttgacca tgattccgaa ccgggcttcg cgctggttca atttcaccgg ccccagcgaa 2700 

gtaatcaaca gcgcttgctc cagctcgctg gtggcgctgc atcgggcggt tcaatcgctg 2760 

cgccaaggcg aaagcagtgt cgccctggta ctcggcgtga accttatcct ggctcccaag 2820

gtgctgttag ccagtgcaag cgcgggcatg ctttcgcccg atggccgctg caagacgctt 2880

gacgccgccg ccgatggctt cgtgcgttcg gaagggatcg caggggtgat attgaagcca 2940 

ctggcgcagg cgctggccga tggtgacagg gtctacggtc tagtccgcgg cgtggcggtc 3000

aaccatggcg gccgttccaa ttccttgcgt gctcccaacg tcaacgcgca gcggcaactg 3060 

ctgatccgga cttaccagga agccggtgtc gagccggcca gcgtcggtta tgttgaacta 3120 

cacggcactg gtaccagcct gggtgatccg atcgaaatcc aggcgctgaa ggaagctttc 3180 

attgcgttgg gggcacaggc cgccccgtca aactgcggca tcggttcggt gaagtccgcg 3240 

ctgggccatc tagaagccgc tgcaggcctg accggcctga tcaaggtgct gctgatgctc 3300

aagcacggcg agcaggccgg cacgcgccat ttcagcacgc tcaatccgct gatcgatttg 3360 

cgaggtacgt cattcgaagt ggtggcgcag catcgcgcat ggccgtcgca ggtcggcatt 3420 

cacggcacac tcttgccgcg tcgcgcgggt atcagctcat tcggcttcgg cggcgccaat 3480 

gcgcatgcga tcgtggaaga gcatgtcatt gccacgcccc cctcgacgag ctccgctggc 3540 

ggcccggtag gtatcgtgtt gtcagccggt agtgaagctg tcttgcggca acaagtgctg 3600

gccttgtcag cctggctaag gcagcaatcg ccgacacccg cgcaaatgat cgatgtcgcc 3660 

tacaccttac aggtaggacg cgcagccctg tcgcacaggt tggcttttag cgcgacggac 3720 


gccgagcagg cattggcgag gcttgagggt cgtctggcgg gcgtgatgga tgccgaggtc 3780 

catcacggtg tcgtggatgc tgccgcaacg gctcccgaac atgggcggca gacgcgcgaa 3840 

ggtcttgccg gtttgctgcg agcctggact cagggcgtgc gcgtcgattg gtcggcgctg 3900

tacggcatac agcgaccgca gcgcgttagc ctgcctgtct accccttcgc tagggaacgc 3960 

tattggctgc ccggccaggc tatgcatgcc gctgcggacg ctcatccgat gctgcagctg 4020 

ttgcatgcca atgccaaact acatcgctac gccttgcgta ggtccggctg cgcaagcttt 4080 



cttgttgatc attgcgtgga tggtcgacag gtactaccgg cagccgtgca actggaattg 4140

gtgcgcgccg tggcgcagcg ggtcatggcg caggatgagg gttgtatcga actggcgcag 4200 

gtcgcctttt tgcatcccct catgatggag gagactgagc tggaggtcga aatcgaactg 4260 

tcgaagagcg atcaagatga gttcgatttc caacttcacg atgctcaccg ccaacaggtc 4320 

tttagccagg ggcacgtacg tcgccgggtc tatacggcga caccgcgctt ggatttagcc 4380 

cagctgcaaa agctttgtgc cgagcgcgtg ttgtccggcg aagactgtta tgcgcacttc 4440 

accgcctgcg gattgcagct cggcgaccgg ctcaaatccg tgcaatcgat cggctgcgga 4500 

cgcaatggcg agggcgagcc gatcgcattg ggtgtcctgc gcctgccacc atcaagcgtt 4560

gaagacagcc atgtgctgcc tcctagcctg cttgatggtg ccttgcagtg tagccttggc 4620

ttgcagcgtg atgtcgagca catcgccatg ccatacacgc tggagcggat gacggcgcat 4680

gcgccgattc ctcccgaggc ctgggtgctg ctgcgtcacg gccatgcagc cagacagtcc 4740

ctggacatcg atctcctgga ttccgaaggt agggtctgcg tcagcctcgg caattacacc 4800

ggccgtgcac cgaaagccgt ttccgccgtc agggcgcttg tcttggcacc ggtctggcaa 4860

gcgttgaccg aaacggcgcc ggcatggccc gatccggccg aacgcatcgt tacggtagga 4920 

gacgatgcat ggcgtagtca cttcggtttc gacgagccgg ccttgtccct ggaggacagc 4980 


gtcgaagtca tcgcgacgcg actgggccag agcggcaagt tcgatcatct agtctggatc 5040 

gtgccgatag ccgagagtga aaccgatatt gcagcgcaag gttcagcggc gatcgccggt 5100 

ttccggttgg tcaaggcgct gcttgcgttg ggctatgcgc atcgcccgct gggtctcacc 5160

gtgctgactc gccaagccct tacgcggcag ccgtcgcacg cggcagtgca cgggctgatc 5220 

gggacgctgg ccaaggaata ctgcaactgg aaaatccgtc tgctcgacct gccgagcgta 5280 

aaatcttggc cgcaatggga gcaattgcgg tcgttgcctt ggcatgcgca gggcgaagcc 5340 



ctgatcggcc gtgggacttg ttggtatcgg cggcagttgt gtgaagtgct gccgctgccg 5400

tcgttggaac cgccgccgta ccgcgtaggc ggtgtctacg tcgtgatcgg cggcgctggc 5460 

ggcttgggtg aagtattgag cgaacacttg atccgcacgt acgacgcgca gctgatctgg 5520 

atcgggcggc gcgtgctgga cgaaggcatt gcgcgcaagc agacccggct tgcgtcgctg 5580 

ggccgcgcac cgcattacat ctccgcggac gcgagtgacc cggctgccct gcaggcggca 5640 

cataatgaga tcgttgcgct gcatggccag ccccatgggc tcatcctaag caacatcgtg 5700

ctgaaggatg ccagtctggc tcgtatggag gaagccgatt tccgtgacgt gctggccgcg 5760 

aaactcgacg tcagcgtgtg tgcggcacag gtgttcggca cggcccccct tgatttcgtg 5820

ctgttttttt cttccatcca gagcactacc aaggcggccg ggcaaggtaa ctacgccgcc 5880 

ggctgctgct atgtcgacgc tttcggcgag ctatgggcgc gccggggttt gagggtaaag 5940 

accatcaact ggggctactg gggcagcgtg ggcgtcgtag cgggcgagga ctatcgccgg 6000

cgcatggcgc aaaaacacat ggcttcgatt gagggtgccg aagcgatgca ggtgttgtcg 6060 

cagttgttgt gtgcgccgtt gcaacggctt gcctacgtca agatcgacga tgctaacgca 6120 

atgcgcgctc tgggcgtagt agaggacgag agcgtgcaaa tccctgtgca cgcaccggcc 6180 

gagcctcGca gagggcagcc tggtcccgtg gtcgagttgt cggtgaatct ggatgcccgg 6240 

cgcgaacggg aaactttgct ggcggcctgg ctgcttgagt tgatcgagca actcggtggt 6300 

tttccgccgg caagtttcga catcgctacg cttgcgcaac gcctgcacat cgtacccgcc 6360 

tatcgaagct ggctggaaca cagcgtgcgg atgctcggcg tgtatggtta cctcagagcg 6420 

acgggggaaa gccgattcga gctggccgac aagccgcccg atgatgccag gggtgcctgg 6480 

aacgcgcatg tgcacgaggc cagcgtcgaa gccggtgaag aggcacagcg gcgtctgctc 6540 

gatcgctgca tgcgggcgtt gccggcggtc cttcgaggcg aacgcaaggc caccgaattg 6600 



ctgtttccgg aaggttcgat ggcgtgggtc gagggtatct accagaacaa cccgcttgcc 6660

gattacttca acgcacaact agtcacgcga ctgattgcct acttgagacg acgactagag 6720 

tcgacgccta cggcgcgcct gaagctgtgc gagatcggcg ccggcagcgg tggtactact 6780 

gcaagcgtgc tacaacagtt gcaggcatat ggtgagcata ttgaggaata tctctatacc 6840 

gacctgtcgc ctgtcttcct gcatcatgcg gaaaaacact atcagccacg agcgccttat 6900 

ttgaggaccg cctgtttcga cgtagcgcgc gcgccgacgg cgcaggccct ggaatctggc 6960 

ggctacgacg tggtgattgc cgccaacgta ctgcatgcta cgcgcgatat cgccaagacc 7020 

ttgcgcaatg cgaaggcact cctcaaacct ggcggtctgc tcttgctcaa cgaagtgatc 7080 

gagcgcagcc tcgtcttgca cctgactttc ggtctgctgg agagctggtg gttgccccag 7140 

gacaagatct tgcgccttgc cggctcgccg ttgctggctt gcgccacctg gcgcagcctg 7200 

ctggaggctg agggttttgc ggggctgagc gtgcacaggg cgcaacccga tgccgggcag 7260 

gccatcatct gtgcctacag cgatgggata gtgcggcaag ccagtacgat cgaggttgcg 7320 

cggaatgaaa aagtaaccgt tccgtcgcag ccggcggaag ccggggaatc gccgctggat 7380 

ctggLcaaaa aactgcttgg acgcattctg aaaatggatc cggccacact cgataccagc 7440 

cacccgctgg agtactacgg tgtcgattcg atcgtggcga tcgaactggc tatggcactg 7500 

cgcgagacat tcccgggttt tgaagtcagc gagctgtttg aaacgcaatc catcgatacc 7560 

ttgttgggct ctcttgagca ggctcctctc cttgctaccc tcacagctcc gccgcaacaa 7620 

gacatgctgc agcagctgaa acaactgctg gcgcgtacgc tgaagctgga cattacgcag 7680 

atcgacacga gcaagacgct ggagagctat ggtgtcgact ccatcgtcat catcgaatta 7740

gccaacgcct tgcgtgagcg ctatccgagc ttggacgcgt cacagctgat ggaaacctta 7800 

tcgatcgacc ggctggttgc ccaatggcag gcaacggagc ccgccgtacc ggcagagcca 7860

acagcggaac cgccggtagc cgacgaagac gccgctgcca tcatcggact ggccggccgc 7920 

tttccaggcg cggacacgtt ggaggagttc tggaacaacc tgcgcaacgg ccaaagcagt 7980 

atgggagagg tgccaggcga gdgctgggat caccagcact acttcgacag tgaacgccag 8040 

gcaccgggca agacgtatag ccgctggggt gcgtttctga gggacataga cggcttcgat 8100 

gcagccttct ttgaatggcc cgacagcgtc gcgctggaat cggatccgca agcgcggata 8160 

tttctagagc aggcctatgc cgggatcgaa gatgccggct acacgcctgg ctcgctcagc 8220 

aagagccaac gcgtaggtgt attcgtaggt gtgatgaatg gttactacag cggcggagcg 8280 

cgct:tctggc aaatcgccaa ccgcgtgtcg taccagttcg attttcgcgg gccaagcctg 8340 

gcggtggata ccgcctgttc ggctticgctc accgcgatcc acctggcgct ggaaagcctg 8400 

cgcagcggca gttgcgaggt cgcactggcc ggtggcgtiga atctgctggt cgatccgcag 8460 

caatatctta attcggctgg cgccgcgatg ctctccgccg gcgccagctg tcggccgttc 8520 

ggcgaggccg cggacggttt cgtggccggc gaagcctgcg gcgtggtgct gctcaagccg 8580 

ctcaagcaag cgagggccga tggcgatgtg atccatgccg taatcagggg cagcatgatc 8640 

aatgccggtg ggcacaccag cgcgttctcc tcgcctaacc ctgccgccca ggccgaagtc 8700 

gtgcggcagg ccttgcagcg cgcgggcgtg gcgcccgatt cgatcagcta catcgaggcg 8760 

catggcaccg gcaccgtact aggcgatgca gtggagttgg gtgctttgaa taaagtgttc 8820

gacaagcgcg cggcgccatg cccgatcggc tcgctgaagg cgaacatcgg ccatgccgaa 8880 

agcgccgcgg gcatcgccgg cctggccaag ctggtattgc agttcaggca tggcgagttg 8940 

gtgcctagtc tgaatgcgtt tcccttgaat ccctatattg agttcggtcg cttccaggta 9000 

caacagcagc cggcaccgtg gccgcgccgt ggcgcccagc cgcggcgcgc cgggttatct 9060 

gccttcggtg ctggcggatc gaatgcgcac ctagtggtag aggaagctcc ggctabggct 9120

cccggggtct cgatcagcgc cagctctcca gccttgatcg tgctttcggc gogaacgctg 9180

cctgccttgc aacagcgtgc tcgcgatctg ctcgtctgga tgcaagcgcg gcacjgtggat 9240 

gacgtcatgc tggccgacgt tgcttatacg ctgcacttgg gccgcgtcgc gatggagcaa 9300 

cgcctggctt ttaccgctgg ctcggctgcc gagttgagcg agaaattaca ggcttacctg 9360 

ggccatgcga ttcgggccga catctatctg agcgaggaca cgcccggcaa accggcaggc 9420 

gctccgatcg tggccgagga agatcbgctc acgctgatgg atgcctggat cgaaaagggc 9480 

cagtacggtc gtttgctgga gtactggacc aagggccaac cgatcgactg gaacaaacnc 9540 

tattggcgca agctgtakgc ggacggacgg ccgcggcgga tcagcctgcc cacctatccg 9600

ttcgagcacc ggcgttattg gcaaacgccg gtgccgggcg agcgaagcct gcacgccacc 9660 

gcgccagcta ctcgggaaac ggttgcggtt ggtgccatgc cggatccggc cggcgctacg 9720 

gtgcaagccc ggttgtgcgc cttgtgccaa gtgttgttgg gcaaaccggt cacggcccag 9780 

atggatttct ttgccgtcgg cggccattcg gtgctggcga tccaattggt ctcgcgcatc 9840 

cgcaaaagct tcggggtgga gtatccggtc agcgctttgt tcgaatcggc gctgttgtcg 9900

gacatggcgc ggcagatcga acaattgcgg gtgaacggag tcgccaagcg catgccggcg 9960

ttgttgcctg ccgggcgcgt gggcgcgatt cctgcgactt atgcacagga gcgcctatgg 10020 

ctcgtccacg aacatatgag tgagcaacgc agtagttaca acaCcacctt tgccatgcac 10080 

ttcagaggcg tcgacttccg tgctgaagcg atgcgtgccg cattgaacgc gctggtggtg 10140 

cggcacgaag tgctgcgcac acgctttctt tcggaggacg ggcagctgca acaggtgatc 10200

gctgcctcgt tgacgttgga ggtgccggta agagagatgt cggtcgagga ggtcgacctg 10260 

ctgctggccg cgagcacgcg ggagactttc gatctgcggc aggggccctt gttcaaggca 10320 

cgcatcctgc gcgtggcggc cgatcaccat gtggtgttga gcagcatcca ccacatcatt 10380

tccgacggct ggtcgctggg agtgttcaac cgtgacctgc accagctgta cgaggcgtgt 10440
ttgcgcggca cgccccccac actgccgacg ctggcggtgc agtatgccga ctacgcgctg 10500 
tggcaacggc aatgggagct ggcggctccg ctgtcgtact ggaogcgggc actggaaggc 10560 
tacgacgacg gcctggactt gccctacgac cggccgcgcg gcgccacgcg ggcgtggcgg 10620 
gcagggctgg tcaaacaccg ctatccgccg caactggccc agcagttggc ggcctacagc 10680 
caacagtacc aagcgacgct gttcatgagc ctgctggcag gcctggcgtt ggtgctgggc 10740 
cgttacgccg atcgcaagga cgtgtgcatc ggcgcgacgg tctccggccg cgaccagctg 10800 
gagctggaag agctgatcgg ctttttcatc aatattttgc cgctgcgggt ggacctgtcg 10860 
ggggatccgt gcctggagga ggtgctgctg cgcacgcgtc aagtggtact ggatggcttc 10920 
gcgcaccagt cggtgccgtt cgagcacgtg ttgcaggcgc tgcggcgtca gcgcgacagt 10980 
agccagatcc cgctggtgcc ggtgatgctg cgacaccaga acttcccgac gcaggagatt 11040 
ggcgattggc ccgagggagt gcggctgacg cagatggagc tggggctgga ccgtagcacg 11100
ccgagcgagc tggattggca gttctacggc gacggcagct cgctggagct gacgctggaa 11160

tacgcgcagg acctcttcga cgaagcgacg gtgcggcgga tgatcgcaca ccaccagcag 11220 
gcgttggagg cgatggtgag ccggccacag ctgcgggtgg gcaagtggga catgctgacg 11280 
gccgaagagc gccggctgtt tgccgcgcta aatgcgacag gtacgccacg ggagtggccc 11340 
agtctggcgc agcagttcga acggcaggcg caggcgacgc cgcaggccat cgcgtgcgtg 11400 
agcgatgggc agtcgtggag ctatgcgcag ttggaggcgc gcgccaacca gctggcacag 11460 
gcgctgcggg ggcagggcgc gggccgggac gtgcgggtgg cggtacagag tgcgcgcacg 11520 
ccggaactgc tgatggcctt gctggcgatc tttaaggccg gtgcgtgcta tgtgccgatc 11580 
gatccggcct acccggcggc ctaccgcgag ce^gatcctgg ccgaggtgca ggtgtcgatc 11640

gtgctggagc aagacgagct ggcgctggac gagcaagggc agbtccacaa tccgcgttgg 11700

cgcgagcaag ccccgacgcc gctggggctg agggagcaCc cgggcgacct ggcgtgcgtg 11760 

atggtgacct ccggctcgac cggccggccc aagggcgtga tggtgccgta tgcgcagctg 11820 

tacaactggc tgcatgcagg ctggcagcgt tctccgttcg aggccgggga gcgggtgctg 11880 

cagaagacct cgatcgcctt tgcggtgtcg gtaaaggagt tgctaagcgg gctgctggcg 11940 

ggggtggaac aggtgatgct gccggacgag caggtgaagg acagcctggc gttggcgcgg 12000 

gcgattgagc aatggcaggt gacgcggctg tacctagtgc catcgcacct gcaggcgctg 12060 

ctggacgcga cgcaaggacg agacgggcta ctgcactcgc tgcgtcacgt ggtgacggcg 12120 

ggggaagcgt tgccgtctgc ggtgcgcgaa acggtgcggg cgcgtctgcc acaggtgcag 12180 

ctatggaaca actacggctg cacggaactg aacgacgcga cctaccaccg gtcggatacg 12240 

gtggcgccag gaacgtttgt gccgatcggc gcaccgatcg ccaacaccga ggtatacgtg 12300 

ctggaccggc agctgcggca ggtgccgatc ggggtgatgg gcgagctgca cgcacacagc 12360 

gtggggatgg cgcgcggcta ctggaaccgg ccggggctga cggcctcgcg cttcatcgcg 12420 

cacccgtata gcgaggagcc gggcacacgg ctgtacaaga ccggtgacat ggtacgccgg 12480 

ctggcggacg ggacgctgga atacctgggc cgacaggact tcgaggtcaa ggtgcgcggc 12540

caccgggtgg atacgcggca ggtggaggcg gccbtgcggg cgcagcccgc ggtggccgag 12600 

gcggtggtga gcggtcaccg ggtggacggg gacatgcagt tggtggccta tgtggtggcg 12660 

cgtgaagggc aggcaccgag cgcgggcgag ttgaaacaac agctgtcggc gcagttgccg 12720 

acctacatgc tgccgaccgt gtaccagtgg ctggagcagt tgccgcggct gtccaacggc 12780 

aagttggacc ggttggccct gccggcaccg caggcggtac acgcgcagga gtacgtcgcg 12840 

Gcacgcaacc aggccgagca acggctggcg gcactgtttg ccgaggtgct gcgggtggag 12900

caggtaggca tccacgacaa cttcttcgcc ttgggtgggc actcgctgtc tgcatcgcaa 12960

ctgatctcgc gtattgccag ggatatggcg atcgatctgc ccctggccat gctgttcgag 13020 

ctgcccacgg tagcgcagct tagcgaatcc ctcgccagcc atgcacgcga cagcgattac 13080 

gatgtcatcc ccgcaagcac cgaggaggcg accattccgc tttccactgc gcaggagcgc 13140 

atgtggttcc tgcacaagtt cgtgcaggag acgccgtaca acaccccggg tctcgcctta 13200 

ttgcaaggcg aactggacat ttcggccttg caggtagcat ttcgctgtigt gctagaacgg 13260 

cacgccgtigc tgcgtaccca tttcgtggaa accgagcagc aatgcgtaca ggtcattggc 13320 

gcagcagagc agttcgtgct gcagcttagg tcgattcgcg acgaggctga tctgcatggc 13380 

ctabtgcaca cagccgtcag cgaacccttc gatttagaac gcgagctgcc attgcgcgcc 13440 

ctgctgtatc gcctggacga ccggcggcat tacctagcag tggtcatcca tcacatcgtc 13500 

ttcgacggct ggtcgacctc aatcctgttt cgtgagctgg ccacgcacta tgctgcatgc 13560 

cgccatggcc aatccgcgcc tttgccaccg ctggagctta gctatgccga ttacgcacgc 13620 

tgggagcgtg cgaggctgaa ccaggaagac gcgctgcgca agctcgaata ttggaaaacg 13680 

cagcttgccg atigcaccgcc gctggtgttg cccacgacct atgcgcggcc ggttttccag 13740 

aacttcaatg gcgcgactgt ggcgcttcag atcgagccgc cgctgctgca acgcctgcag 13800 

cgtttcgccg acgcacacag ctttacattg tacatgctac ttctggcagc actgggcgtc 13860 

gtattgtcgc gccatgcccg gcagaagcat ttctgcattg gcagtccggt cgccaatcgc 13920 

gcccgagccg agttgcacgg tttgatcggt ttgttcgtca acaccctggc ggtacggctc 13980 

gatttggacg gcaatcccag cgtgcgcgag ctgctcgaac gcatccactg caccacgctg 14040 

gccgcctacg agcaccagga tgtgccgtcc gaaagaatcg tggaaagcct gaaggtaccg 14100 

cgcgataccg cgcgtaaccc gctggggcag gtgatgctca atttccagaa catgccaatg 14160

tcggcgttcg acctggatgg tgtccaggtg caggtgctcc ccatgcacaa cggcacggcc 14220

aagtgcgagc tgaccttcga cctgctgctg gatggctcac gcctatccgg tttcgtcgaa 14280 

tacgccactg ggctgttcgc gccggaatgg gtccaggcgc tggtacagca attcaagtgt 14340 

gtgctggcgg cattggtgga acggccggag gcatcgctga atgatttgcc catggcgccc 14400 

aacgaggcgc aaccggcgtc gccggcattg atgaagcatg tcgcgccgag cttgcccaac 14460 

ttacttgagg ctatggcggc caatgatgcc gcacgcctcg ccttgcaagc gccggaaggt 14520 

gcgctcagtt acgctcagct aatcgaggca gcaaacgagt tcgcctggcg tttgcggtgc 14580 

gagcacgccg gtccggacaa agtcgttgcc ctgtgcctag cgccttgctc cgccttggtg 14640 

gttgctttgc tggccgcttc attatgcggt gcggcgagcg tgctgatcga tccgacgacg 14700 

actgccgagg cgcaatacga ccagttgttc gaaacgcggg ccggcatcgt ggtgacctgt 14760 

tctagcttgc tggagaagtt gccgctcgac gaccaggctg tagtgctgat cgacgagcaa 14820 

gctgcagaag cgacgccgcg tttgatgcat ttcaccgacg atccagcttt gcccgcaatg 14880 

ctgtattgtg tgtgtgacga aaaggggoga acccgcacga tcatggtcga aagcggcagt 14940 

ttgtcgagtc gcctgctcga tagcgtgcag cgtttcagtc tcgaacgcac cgatcgcttc 15000 

ctgctgcgca gcccgctttc tgccgaactg gcgaataccg aagtactgca atggttggcg 15060 

gcaggcggca gcctcagcat cgcacccatg catggcgatt tcgatgccgc tgcctggctg 15120 

gagaccctcg cgacgtacgc gatcaccgtg gcctacctgg ctcaagttga attgaccgag 15180 

atgctggcgc atctgcaaaa ccatcctctt gagcgcaaca agctggccgg cttacgcgtg 15240 

ctggtggtgc atggcgcgcc cttgccgatc gcgccactga tgcgcctaga cgcgtggttg 15300 

cgagaggtgg gcggttccgc acggatcttc gccgcctacg ggaatgccga gttcggtgcc 15360 

gaaatattga gccaggatgt cagcgctgca ttgcaagcgg gtattggcgc tcaatacaag 15420

catcgccgtg gtctgttccc gttgggtgcc aactcgatgt gtcacgtggt gcagagcaac 15480

ggccgcatcg cgcQcgacgg catggttggt gaattgtgga tcacacagcc agcctgcttg 15540 

tacaaaaccg atgcattggt gcgtcgcctg gcaaatgggc aactggaatg gttgggctcc 15600 

ctcgatgtcc agtcgcgtat cgatgatccc cgcatcgatc tgtgcgtcgt ggaggcacaa 15660 

ctgcgcttgt gcgaagacgt cggcgaagcg gtagtgctgt atgagccgtt gaagcgctgc 15720 

ttggtagcct atctctcggc ccgtagcaca gctgcaatca tgaccgacga gacgctggcc 15780 

aggatccgcc aggccctgag cgaaaccttg ccggattatc tactgcctgc aakctgggtg 15840 

ccgctcgcgc actggccacg cttaccccat gggcgggtcg acctcggcgc cttgcctgca 15900

ccggatttcg atcttgcgcg gcatgagtcg tacatagcgc cacgcacagc cgtcgaacag 15960 

gccgtggccg aaatatggca acgcgtgttg aagcgtaccc aggtcggcgt gcatgacaat 16020

ttcttcgagc tgggcggcca ttcggtgctg gcgatccagc tggtgtccgg cttigcgcaag 16080 

gctttggcca tcgaagtgcc ggtcaccctg gtgttcgagg cgccgatact gggggcgctg 16140 

gcgcggcaga tcgccccctt gttggtcagc gaacggcgtc cgcgcccgcc tggcctgacg 16200

cgcctggagc atacagggcc gattccggct tcgtatgcac aggagcggtt atggctggtg 16260 

cacgagcaca tggaggagca gcgaaccagc tacaacatca gtaacgcagc gcatttcatc 16320 

ggagcagcct tcagcgbcga agcgatgcgt gccgcattga acgcgctggt ggcgcggcac 16380 

gaagtgctgc gcacacgctt tctttcggag gacgggcagc tgcaacaggt gatcgctgcc 16440 

tcgttgacgc tggaggtgcc ggtacgcgag gtgtcggccg aggaggtcga cctgctgctg 16500

gccgcgagca cgcgggagac tttcgatctg cggcaggggc ccttgttcaa ggcacgcatc 16560 

ctgcgcgtgg cggccgatca ccatgtggtg ttgagcagca tccaccacat catttccgac 16620 

ggctggtcgc tgggagtgtt caaccgtgac ctgcaccagc tgtacgaggc gtgtttgcgc 16680

ggcacgcccc ccacactgcc gacgctggcg gtgcagtatg ccgactacgc gctgcggcaa 16740

cggcaatggg agctggcggc tccgctgtcg tactggacgc gggcactgga aggctacgac 16800 

gacggcctgg acttgcccta cgaccggccg cgcggcgcca cgcgggcgtg gcgggcaggg 16860 

ctggtcaaac accgctatcc gccgcaactg gcccagcagt tggcggccta cagccaacag 16920 

taccaagcga cgctgttcat gagcctgctg gcaggcctgg cgttggtgct gggccgttac 16980 

gccgatcgca aggacgtgtg catcggcgcg acggtctccg gccgcgacca gctggagctg 17040 

gaagagctga tcggcttttt catcaatatt ttgccgctgc gggtggacct gtcgggggat 17100 

ccgtgcctgg aggaggtgct gctgcgcacg cgtcaagtgg tactggatgg cttcgcgcac 17160 

cagtcggtgc cgttcgagca cgtgttgcag gcgctgcggc gtcagcgcga cagtagccag 17220

atcccgctgg tgccggtgat gctgcgacac cagaacttcc cgacgcagga gattggcgat 17280 

tggcccgagg gagtgcggct gacgcagatg gagctggggc tggaccgtag cacgccgagc 17340 

gagctggatt ggcagttcta cggcgacggc agctcgctgg agctgacgct ggaatacgcg 17400 

caggacctct tcgacgaagc gacggtgcgg cggatgatcg cacaccacca gcaggcgttg 17460 

gaggcgatgg tgagccggcc acagctgcgg gtgggcaagt gggacatgct gacggccgaa 17520 

gagcgccggc tgtttgccgc gctaaatgcg acaggtacgc cacgggagtg gcccagtctg 17580 

gcgcagcagt tcgaacggca ggcgcaggcg acgccgcagg ccatagcatg cgtgagcgat 17640 

gggcagtcgt ggagctatgc gcagttggag gcgcgcgcca accagctggc acaggcgctg 17700 

cgtgggcagg gcgcgggccg ggacgtgcgg gtggcggtac agagtgcgcg cacgccggaa 17760

ctgctgatgg ccttgctggc gatcttcaag gccggtgcat gctatgtgcc gatcgatccg 17820 

gcctacccgg cggcctaccg cgagcaaatc ctggccgagg tgcaggtgtc gatcgtgctg 17880 

gagcaaggcg agctggcgct ggacgagcaa gggcagttcc gcaatcggcg ttggcgcgag 17940 



caagccccga cgccgctggg gctgagggga catccgggcg acctggcgtg cgtgatggtg 18000 

acctccggct cgaccggccg gcccaagggc gtgatggtgc cgtatgcgca gctgcacaac 18060 

tggctgcatg caggctggca gcgttctgcg ttogaggccg gggagcgggt gctgcagaag 18120 

acctcgatcg cctttgcggt gtcggtaaag gagttgctaa gcgggctgct ggcgggggtg 18180 

gggcaggtga tgctgccgga cgagcaggtg aaggacagcc tggcgttggc gcgggcgatc 18240 

gagcaatggc aggtgacgcg gctgtaccta gtgccgtcgc acctgcaggc gctgctggac 18300 

gcgacgcaag gacgcgacgg gctactgcac tcgctgcgtc acgtggtgac ggcgggggaa 18360 

gcgttgccgt cggcggtggg cgaagcggtg cgggtgcgcc tgccacaggt gcagctatgg 18420 

aacaactatg gctgcacgga actgaacgac gcgacctacc atcggtcgga tacggtggcg 18480 

ccaggaacgt ttgtgccgat cggcgcaccg atcgccaaca ccgaggtata cgtgctggac 18540 

cggcagctgc ggcaggtgcc gatcggggtg atgggcgagc tgcacgtaca cagcgtgggg 18600 

atggcgcgcg gctactggaa ccggccgggg ctgacggcct cgcgcttcat cgcgcacccg 18660 

tatagcgagg agccgggcac acggctgtac aagaccggtg atatggtacg ccggctggcg 18720 

gacgggacgc tggaatacct gggccgacag gacttcgagg tcaaggtgcg cggccaccgg 18780 

gtggatacgc ggcaggtgga ggcggccttg cgggcgcagc ccgcggtggc cgaggcggtg 18840 

gtgagcggtc accgggtgga cggggacatg cagttggtgg cctatgtggt ggcgcgtgaa 18900 

gggcaggcac cgagcgcggg cgagttgaaa caacagctgt cggcgcagtt gccgacctac 18960 

atgctgccga ccgtgtacca gtggctggag cagttgccgc ggctgtccaa cggcaagttg 19020 

gaccggttgg cgctgccggc gccgcaggtg gtacacgcgc aggagtacgt cgcgccacgc 19080 

aacgaggccg agcaacggct ggcggcactg tttgccgagg tgctgcgggt ggagcaggtg 19140 

ggcatccacg acaacttctt cgccttgggt gggcactcgc tgtctgcatc gcaactgatc 19200

tcgcgcatcc gccaaagttt tcacgtcgat ctgccgctga gccggatctt cgaggcaccc 19260

acgatcgagg gcctggtcag gcagctagcg ttgcctagtg aaggcggcgt ggccagcatc 19320 

gccagggtag cgcgaaaccg gacgatccca ttgtcgctgt tccaggaacg cctgtggttc 19380 

gtgcaccaac acatgcctga gcaacgcacc agttacaacg gcacgctcgc cttgcgtttg 19440 

cgtggtcctt tgtcggtgga agcgatgcgt gcagcgctgc gtgcgttagt gctgcgccac 19500 

gaaatcttgc gtacccgctt cgtgttgccg accggtgcta gcgagccggt gcaggtcatt 19560 

gacgagcaca gcgattitcca gctctcagta cagctagtcg aggatactga gatcgcgtcg 19620 

ctgatggatg aactggcaag tcatatctac gacttagcca acggcccgct gttcattgca 19680 

tgccttttgc aactggatga gcaagaacat gtgctgctaa tcggcatgca tcaccttatc 19740 

tacgacgctt ggtcgcaatt caccgtgatg aaccgcgatc tacgcgtgct gtatcaccgc 19800 

cacctcggac ttgccggcgg agatccgccg gaattaccga tccaatatgc cgactatgcg 19860

atctggcaac gcgcccagaa cctggacgcg caactggcct attggcaggc tatgttgcac 19920 

gactacgacg acggcctgga gctgccctac gactatccgc gtccgcgcaa tcgcacctgg 19980

cacgcagcgg tctacacaca cacctatccg gctgaactgg tacagcgctb cgccggcttc 20040 

gtacaggcgc atcagtcgac cttgttcatc gggctgttgg ccagcttcgc ggtcgtgttg 20100

aacaaataca ccggccggga cgacttgtgc atcggtacca ccacggcagg gcgcacgcac 20160 

ctggagctgg agaacctgat cggtttcttc atcaacatct tgcctttgcg cttgcgcttg 20220 

gacggcgatc cggacgttgc cgaaatcatg cggcgaacac ggttggtggc gatgagcgcg 20280

tttgagaacc aggcgctacc gttcgagcac ctgctcaacg ccctgcacaa gcaacgtgac 20340 

accagccgga ttccgctagt tccggtggtg atgcgtcatc agaacttccc ggacacgatc 20400 

ggcgactgga gcgatggcat ccgtaccgaa gtgatccagc gcgatctgcg tgccaccccc 20460 



aatgaaatgg acctgcaatt cttcggcgac ggtacggggc tttcggtcac agtggaatac 20520 
gcggcggagc tgttctcaga agcgaccatt cgccgcctga tccaccatca ccaactcgtc 20580 
ctggagcaga tgtitggcggc ccatgaaagc gccacgtgcc ccttggatgt tgccgactag 20640



<210> 21 

<211> 1032 

<212> DNA

<213> Xanthomonas albilineans

<400> 21 

atggattcag cgttacctac atctgcattt accttcgatc tcttttacac cacggttaac 60



gcctactatc gcactgccgc agtcaaggcg gcgatcgaac tggggctatt cgatgtggtg 120 

gggcagcagg gccgaactcc cgcagccatc gccgaggcct gccaggcgtc gccgcgcggc 180 

attcgcatcc tttgctatta cctagtatcg atcggttttc tacgccgcaa cggtggcctg 240

ttctacatag atcgcaacat ggccatgtac ctggatcgta gttcgcccgg ctacctgggt 300 

ggcagcatca agttcctgct ctcgccctac atcatgagcg ccttcaccga tctgaccgcc 360 

gtagccagga ccggcaagat caacctggcg caggacggcg tggtggcacc ggatcacccg 420 

cagtgggtgg aatttgcacg cgcgatggca ccgatgatgg cgctgccctc ggcgttgatc 480 

gccaatatgg tgtcgttgcc cgctgatcgg ccgattcgtg tgctggacgt ggcagccggc 540

cacggcctgt tcggcatcgc cttcgcgcag cgcttccgcc aggctgaagt gagcttcctg 600 

gactgggaca acgtgctaga cgtagcacgc gaaaacgccc aggcggccaa agtggccgag 660 

cgagcgcgtt tcctgcccgg caacgcattc gacctcgatt acggcagcgg ctacgacgtg 720 

atcttgttga ccaacttcct gcaccatttc gatgaggtcg atggcgagcg catcttggct 780 

aagacgcgcg atgcgctgaa cgacgacggc atggtgatca ctttcgaatt catcgccgac 840



gaagagcgtt cctcaccgcc gctggccgcc accttcagca tgatgatgct gggcaccacc 900 

ccggcgggcg agtcctacac ctatagcgat ctggaaagga tgtttcggca tgccggcttc 960 

ggccacgtgg aactataaatc gataccgccg gccttgctga aagtggtggt ttcccgcaag 1020

agggccccat aa 1032 

<210> 22

<211> 504
<212> DNA

<213> Xanthomonas albilineans
<400> 22

atgatcgaat cggcgacatc ccctgtggcg aaaaccgagc gcatctggtg caccgagctg 60 

gacctggatg cactcaacgc catgtcggcc aacacgatgc aggccctgct cggtatacgc 120 

atgatcgaga tcggctcgga ctatctggtc tcctgcatgt cggtggactg gcgttgccac 180

cagccctatg gggtattgca tggcggcgca tcggtcaccc tggccgaggc taccggcagc 240 

atggcggcct ccatgtgcgt gccggccggc caacgttgcg ttggcctaga catcaatgcc 300

aaccacatcg cgagcatctc cagtggccaa gtacagtgca tcgcgcggcc gctgcacata 360 

ggggccttga cccaggtatg gcagatgcgc atctatgacg aaggtgaccg cacgatctgc 420 

gtgtcgcgcc tgaccatggc ggtattatcg gtgcacgtcg cgcgcgtatc cccgaatcca 480

gccagcagcg gagtccagac gtga 504 



<210> 23

<211> 2826 
<212> DNA 

<213> Xanthomonas albilineans 
<400> 23

gtgaacgaaa ctgcaactgt aaccaaggct accctcagtt cagcgaaggc gagtataacg 60 



ccagcctgcg ttcaccaatg gtttgaagcg caggtgagtt cgacaccgga tgcgcctgct 120 

gccttcttag gcgagcgtcg aatgagttat ggccagctca acacccgcgc caatcggctt 180 

gcacggctgt tgcagtcaca gggcgttggg cctggtgccc gggtcgcggt gtggatgaat 240 

cgcagccccg aatgcctggc cgctttgctg gcggtcatga aggccggggc agcttatgta 300 

ccgatcgacc tgagcctgcc gatccgacgt gtccaataca tcttgcagga cagccaggcc 360 

cggctcgtac tggtcgatga cgaagggcaa ggccgcctgg acgaacttga gctgggcgcg 420 

atgactgccg tcgatgtctg cggcactctg gacggcgacg aggcgaatct ggacctgcct 480 

cgcgatccgg cgcagccggt ttattgcatc tatacctccg gctccacagg tagccccaag 540

ggcgtgctgg tacggcacag cgggttggct aactacgtgg cccgggctaa gcggcaatac 600 

gttacggctg acacgacgag tttcgccttt tactcctcgc tgtcgttcga tctgaccgtc 660 

acctcgatct acgtgcccct ggtggctggc ctgtgtgtgc atgtgtaccc ggagcagggc 720 
gacgacgtgc cggtaatcaa ccgcgtgctg gacgacaacc aagtagacgt gatcaagctg 780
acaccctcgc acatgctgat gctgcgcaac gcggcactgg cgacgtctcg gctgaagacg 840

ctgatcgtgg gtggcgagga cctgaaagcg gcggtggcgt acgacatcca tcagcggttc 900 
cgccgcgatg tggcgatcta caacgaatac ggtcctaccg aaaccgtagt ggggtgcgcg 960 

atccatcgtt acgatccggc gaccgaacgc gaaggctcgg tgccgattgg tgtgccgatc 1020 

gatcacacca gcctccacct gctcgatgaa cgtctgcagc cggtcgcacc gggcgaggtc 1080 

ggccagatcc acatcggtgg cgcgggcgtg gccatcggct atgtgaacaa gccggagatc 1140

accgatgcgc aattcattga caatcccttc gaaggcagcg gccggcttta cgccagtggc 1200 

gacctaggac gcatgcgtgc cgacggtaag cttgaattcc ttggccgcaa ggattcgcag 1260 

atcaagctgc gcggctaccg catcgaactg ggcgagatag agaacgttct gcttggccac 1320 






gcagccttgc gcgaatgcat cgtggatacc accgtggcgc cgcgccgcga cbatgacagc 1380
aagagcttgc gctattgcgc gcgctgcggt atcgcttcaa atttccccaa taccagcttc 1440
gacgagcacg gtgtctgcaa ccattgccac gcctacgaca aataccggaa cgtggtcgag 1500
gattatttcc ggaccgaaga tgagctacgt actatcttcg agcaggtcaa ggcgcacaac 1560
aggctccgct acgactgcct ggtggctttc agcggcggca aggacagcac ctatgcgcta 1620
tgccgcgtag tggacatggg cctgcgcgtg ttggcgtaca ccctggacaa tggctacatc 1680
tccgacgagg ccaaggcaaa cgtcgaccgc gtcgtgcgcg agctgggggt ggaccatcgc 1740
tatctgggta ctccacacat gaacgccatc ttcgtggaca gcctgcatcg ccacagcaac 1800
gtctgcaacg gctgcttcaa gaccatctat acgctgggta tcaacctggc gcacgaagtg 1860
ggcgtaagcg acattgtaat gggcctgtcc aaaggacagc tgttcgagac gcgcctgtct 1920
gagctgtttc gcgccagcac cttcgacaac caggtatttg agaagaacct gatggaggcg 1980
cgcaagatct accatcgcat cgacgacgcg gcggcccgcc tgctggacac ctcttgcgtg 2040
cgcaacgatc gcttgctcga aagtacgcgt ttcatcgact tctaccgcta ctgcagtgtc 2100
agccgcaagg acatgtatcg ctatatcgcc gagcgcgtag gctggagccg tccggctgac 2160
accggccgct cgactaactg cctgctcaac gatgtgggca tctacatgca caagaagcaa 2220
cgtggctatc acaactattc gttgccctac agttgggacg tgcgggtagg ccatatccca 2280
agggaagacg cgatgcgcga gctggaggac accgacgata tagacgaggc caaggtactg 2340
ggcctgctca agcagaccgg ctatgactca agcctgatcg atacccaggc gggcgatgcg 2400
cagctgatcg cctactacgt ggcggcggag gaactggatc cggtggcatt gcgcaatttt 2460
gctgctgcga tcttgcccga gtacatgctg ccttcgtatt tcgtgcggct ggaccgaatg 2520
ccgttgacgc cgaatggcaa ggtgaaccgc cgagcattgc cgaggccgga gttgaagaag 2580
aacgccagcg aggcgcatac cgagccgagc agtgcgctag agcaggaact ggtgcaaatc 2640
tggaaagagg tgctgatggt cgacaaggtc ggcgtcaggg acaacttttt cgagctgggc 2700
ggccactcgc tgagcgcgct gatgttgctc tacagcatag ccgagcgcta ccagaagatg 2760
gtcagcatcG aggcattctc ggttaatccg accatcgaag gtctgtcgga gcatctggtc 2820
gcataa 2826



<210> 24
<211> 837
<212> DNA

<213> Xanthomonas albilineans
<400> 24

atgcccaatg ccgtaccgat gcagggcgcg cggggactcc cgcagccgca agcgatigaac 60

ccagggttgc cgagcgtcgg cggcttgagc gcaggccagc cattgcagtt gtcgttagca 120

ccggaactgc aggcagccgc gcgcagtgcc caccgccatc tgctcgacga cggcacggcg 180

ctttacctgc tggcgttcga taccgcgcaa ttcgacccgg gggctttcgc ggcaatggca 240

atcgcccgcc cggacagcat cgcccgcagc gtgcgcaagc gtcaggccga gttcctgttc 300

ggccgtctgg ccgcgcgact ggcgcbgcaa gaggtgctgg gacctgcgca agcgcaggca 360

gacattgcaa tcggcgcgac gcgcgcgccc tgctggcctg ccggcagcct gggcagcatt 420

tcccattgcg aggactacgc ggccgccatc gccatggcgg ccggcacccg ccacggcgtg 480

ggcatcgatc tggaacgacc aatcacaccc gcggcgcgcg cggcgttgct gagcatcgca 540

atcgatgccg acgaagccgc tcgtctggca aaggcggcag acgcgcagtg gccgcaagac 600

Gtgctgctga ccgcactatt ttcggccaag gaaagcctgt tcaaagccgc ctacagcgcg 660

gtcggacgct acttcgactt cagcgcggca cgcctgtgcg gcatcgacct ggcacggcaa 720

tgcctgcatc tgcgcctgac cgagacactc tgcgcgcaat tcgtggccgg gcaagtgtgc 780

gaggtcggct tcgcgcgcct accaccggac ctggtgctca cccactacgc ctggCga 837



<210> 25 
<211> 1905 
<212> DNA 

<213> Xanthomonas albilineans 
<400> 25 

atgagcgtgg aaacccaaaa agaaaccctg ggctttcaga ccgaggtcaa acagctgctg 60 

cagctgatga tccattcgtt gtattccaac aaggagatct tcctgcgcga gctgatctcc 120 

aatgcctccg acgcggccga caaactgcgc ttcgaggcac tggtcaagcc ggaacttctg 180 

gacggcgatg cgcaactgcg catccgcatc ggcttcgaca aggacgccgg caccgtcacc 240 

atcgacgaca acggcatcgg catgagccgc gaggagatcg tcgcgcacct gggcaccatc 300 

gccaaatccg gcacctccga tttcctcaag catctgtccg gcgatcagaa gaaggattcg 360 

cacctgatcg gccagttcgg tgtcggcttc tacagtgcct tcatcgtcgc cgatcaagtg 420 

gacgtgtaca gccgtcgcgc cgggctgccg gccagcgacg gcgtacactg gtcctcgcgt 480 

ggcgaaggcg agttcgaggt cgccaccatc gacaagcccg agcgcggcac ccgcatcgtg 540 

ctgcacttga aggaggaaga gaaaggcttc gccgacggtt ggaagttgcg cagcatcgtg 600 

cgcaagcact ccgaccacat cgccttgccg atcgagctaa tcaaggaaca ctacggcgag 660

gacaaggaca agccggaaac ccccgagtgg gagaccgtca atcgcgccag cgcgctgtgg 720 

acacggccgc gcaccgagat caaggacgag gaacaccaag aactgtacaa gcacattgcc 780 

cacgaccacg aaaacccggt ggcgtggagc cataacsiagg tcgaaggcaa actggaatac 840 

acctcgctgc tgtacctgcc cggccgcgcc ccgttcgacc tgtaccagcg cgatgcctcg 900

cgcgggctca agctgtacgt gcagcgcgtc ttcatcatgg accaggccga ccaattcctg 960

ccgctgtacG tgcgcttcat caagggcatc gtcgattcca gcgacctgcc gctgaacgtc 1020

tcgcgcgaaa tccbgcaatc tggtccggtg atcgactcga tgaagtcggc gctgaccaag 1080

cgcgcactgg acatgctgga aaagctcgcc aaagacgatc ccgaacgcta caagggcgtg 1140

tggaagaact tcggccaggt gctgaaggaa ggtccggccc aggacttcgg caaccgcgaa 1200

aagatcgccg gcctgctgcg cttcgcgtcc acccacagcg gcgacgacgc ccagaacgtg 1260

tcgctggccg actacgtggc gcggatgaaa gacggccagg acaagctgta ctacctgacc 1320

ggggaaagct acgcgcaaat caaggacagc ccgcacctgg aggtgttccg caagaagggc 1380

atcgaggtgc tcctgctcac cgaccgcatc gacgagtggc tgatgagcta tctcaccgag 1440

ttcgacagca aatccbtcgt cgatgtggcg cgcggcgacc tggacctggg caagc&ggac 1500

agcgaagaag aaaagcaggc gcaggaagaa gccgccaagg ccaagcaagg gctggccgag 1560

cgcatccagc aggtactcaa ggacgaggtc gccgaggtgc gggtctcgca ccggctgacc 1620

gattcgccgg cgattcttgc catcggccag ggcgacatgg gtctgcaaat gcggcagatc 1680

ctggaagcca gcgggcagaa gctgccggag agcaagccgg tgttcgagtt caaccccgcg 1740

catccgcbga tcgagaaact ggatgcggaa cccgatigtcg atcgtttcgg tgatctggcg 1800

cgggtgctgt bcgatcaggc cgcgctggcc gccggcgaca gcctcaagga cccggccgcc 1860

tacgtgcgtc ggctcaacaa gctgttgctg gagctgtcgg cgtaa 1905



<210> 26

<211> 6879

<212> PRT

<213> Xanthomonas albilineans

<400> 26 



Met Pro Asn Ala Leu Met Gln Ile Thr Leu Val Ala Val Gln Phe Ala
1 5 10 15



Gly Val Leu Leu Gly Val Thr Ala Arg Ala Ala Ile Pro Asn Lys Ala
20 25 30 



Gly Met Arg Arg Ala Trp Pro Pro Phe Pro Gln Ala Cys Cys Arg Ser
35 40 45 



Ile Ala Tyr Leu Met Gln Arg Ser Pro Met Ser Pro Leu Gln Gln Thr
50 55 60 



Leu Leu Thr Arg Leu Ala Ser Ala Ala Ala Ser Arg Thr Met Ile Glu
65 70 75 80 



Phe Pro Arg Pro Glu His Ala Ser Pro Gln Cys Cys Asp Asp Ala Glu
85 90 95 



Leu Ala Arg Leu Ile Val Gln Leu Ser Ala Gly Leu Gln Pro Leu Ala
100 105 110 



Met Pro Gly Thr Tyr Val Ile Ile Ala Ala Pro His Gly Gly Leu Phe
115 120 125 



Ala Ala Ala Leu Leu Ala Cys Leu His Ala Asn Leu Val Ala Val Pro 
130 135 140 



Phe Pro Leu Asp Val Ala Gln Pro Asn Glu Arg Glu Gln Ala Arg Leu
145 150 155 160 



Glu Thr Ile His Ala Gln Leu Met Glu His Gly Asn Val Ala Val Leu
165 170 175



Leu Asp Asp Val Ala Asp Arg Ser Ala Phe Ala Arg Met Ala His Ala 
180 185 190 



Ala Gly Thr Phe Leu Ala Thr Phe Ala Asp Leu Lys Arg Glu Ser Thr
195 200 205 



Ser Ala Ser Leu Cys Pro Ala Ser Pro Ser Asp Ala Ala Leu Leu Leu 
210 215 220 



Phe Thr Ser Gly Ser Ser Gly Glu Ser Lys Gly Ile Leu Leu Ser His
225 230 235 240 



T^g Asn Leu His His Gln Ile Gln Ala Gly Ile Arg Gln Trp Ser Leu
245 250 255 



Asp Glu His Ser His Val Val Thr Trp Leu Ser Pro Ala His Asn Phe 
260 265 270 



Gly Leu His Phe Gly Leu Leu Ala Pro Trp Phe Ser Gly Ala Thr Val 
275 280 285 



Ser Phe Ile His Pro His Ser Tyr Met Lys Arg Pro Gly Phe Trp Leu
290 295 300 



Glu Thr Val Ala Ala Arg Asp Ala Thr His Met Ala Ala Pro Asn Phe 
305 310 315 320 



Ala Phe Asp Tyr Cys Cys Asp Trp Val Met Val Glu Gln Leu Pro Pro
325 330 335 




Ser Ala Leu Ser Thr Leu Thr His Ile Val Cys Gly Gly Glu Pro Val
340 345 350 



Arg Ala Ser Thr Met Gln Arg Phe Phe Glu Lys Phe Ala Gly Leu Gly
355 360 365 



Ala Arg Thr Gln Thr Phe Met Pro His Phe Gly Leu Ser Glu Thr Gly
370 375 380 



Ala Leu Ser Thr Leu Asp Glu Ala Pro Gln Gln Arg Val Leu Glu Leu
385 390 395 400 



Asp Ala Asp Ala Leu Asn Lys Arg Lys Arg Val Ala Ala Gly Ala Ser 
405 410 415 



Gln Ala Arg Val Thr Val Leu Asn Cys Gly Ala Val Asp Gln Asp Val
420 425 430 



Glu Leu Arg Ile Val Cys Pro Glu Gly Glu Thr Leu Cys Arg Pro Asp
435 440 445 



Glu Ile Gly Glu Ile Trp Val Lys Ser Pro Ala Ile Ala Arg Gly Tyr
450 455 460 



Leu Phe Ala Lys Pro Ala Asp Gln Arg Gln Phe Asn Cys Ser Ile Arg
465 470 475 480 



His Thr Asp Asp Ser Gly Tyr Phe Arg Thr Gly Asp Leu Gly Phe Ile
485 490 495 



Ala Asp Gly Cys Leu Tyr Val Thr Gly Arg Val Lys Glu Val Leu Ile
500 505 510 



Ile Arg Gly Lys Asn His Tyr Pro Ala His Ile Glu Ala Ser Ile Ala
515 520 525 



Ala Thr Ala Ser Pro Gly Ala Leu Met Pro Val Val Phe Ser Ile Glu
530 535 540 



Arg Gln Asp Glu Glu Arg Val Ala Ala Val Ile Ala Val Asn His Pro
545 550 555 560 



Trp Thr Pro Ala Ala Cys Ala Ala Gln Ala His Lys Ile Arg Gln Gln
565 570 575 



Val Ala Asp Gln His Gly Val Ala Leu Ala Glu Leu Ala Phe Ala Glu
580 585 590 



His Arg His Val Phe Gly Thr Tyr Pro Gly Lys Leu Lys Arg Arg Leu
595 600 605 



Val Lys Glu Ala Tyr Val Asn Gly Gln Leu Pro Leu Leu Trp His Glu
610 615 620



Gly Lys Asn Arg Asp Val Pro Ala Ala Ala Ala Asp Asp Arg Gln Ala
625 630 635 640 



Gln His Val Ala Asp Leu Cys Arg Lys Val Phe Leu Pro Val Leu Gly
645 650 655 



Val Ala Pro Pro His Ala Gln Trp Pro Leu Cys Glu Leu Ala Leu Asp
660 665 670



Ser Leu Gln Cys Val Arg Leu Ala Gly Ala Ile Glu Glu Cys Tyr Gly
675 680 685



Val Pro Phe Glu Pro Thr Leu Leu Phe Lys Leu Glu Thr Val Gly Ala 
690 695 700

Ile Ala Glu Tyr Val Leu Ala His Gly Arg Gln Ala Pro Thr Pro Thr
705 710 715 720 

Arg Ala Pro Val Ala Ser Thr Thr Cys Ser Glu Glu Pro Ile Ala Ile
725 730 735 



20 Val Ala Met His Cys Glu Val Pro Gly Ala Gly Glu Asn Thr Glu Ala 

740 745 750 



Leu Trp Ser Phe Leu Arg Ser Asp Val Asn Ala He Arg Pro He Glu 
25 755 760 765 



30 



Ser Thr Arg Pro Asp Leu Trp Ala Ala Met Arg Ala Tyr Pro Gly Leu 
770 775 780 



Ala Gly Glu Gin Leu Pro Arg Tyr Ala Gly Phe Leu Asp Asp Val Asp 
785 790 795 800 



35 



Ala Phe Asp Ala Ala Phe Phe Gly He Ser Arg Arg Glu Ala Glu Cys 
805 . 810 815 



40 Met Asp Pro Gin Gin Arg Lys Val Leu Glu Met Val Trp Lys Leu He 

820 825 830 



Glu Gln Ala Gly His Asp Pro Leu Ser Trp Gly Gly Gln Pro Val Gly
835 840 845

Leu Phe Val Gly Ala His Thr Ser Asp Tyr Gly Glu Leu Leu Ala Ser
850 855 860

Gln Pro Gln Leu Met Ala Gln Cys Gly Ala Tyr Ile Asp Ser Gly Ser
865 870 875 880

His Leu Thr Met Ile Pro Asn Arg Ala Ser Arg Trp Phe Asn Phe Thr
885 890 895

Gly Pro Ser Glu Val Ile Asn Ser Ala Cys Ser Ser Ser Leu Val Ala
900 905 910

Leu His Arg Ala Val Gln Ser Leu Arg Gln Gly Glu Ser Ser Val Ala
915 920 925

Leu Val Leu Gly Val Asn Leu Ile Leu Ala Pro Lys Val Leu Leu Ala
930 935 940

Ser Ala Ser Ala Gly Met Leu Ser Pro Asp Gly Arg Cys Lys Thr Leu
945 950 955 960

Asp Ala Ala Ala Asp Gly Phe Val Arg Ser Glu Gly Ile Ala Gly Val
965 970 975

Ile Leu Lys Pro Leu Ala Gln Ala Leu Ala Asp Gly Asp Arg Val Tyr
980 985 990



Gly Leu Val Arg Gly Val Ala Val Asn His Gly Gly Arg Ser Asn Ser
995 1000 1005

Leu Arg Ala Pro Asn Val Asn Ala Gln Arg Gln Leu Leu Ile Arg
1010 1015 1020

Thr Tyr Gln Glu Ala Gly Val Glu Pro Ala Ser Val Gly Tyr Val
1025 1030 1035

Glu Leu His Gly Thr Gly Thr Ser Leu Gly Asp Pro Ile Glu Ile
1040 1045 1050

Gln Ala Leu Lys Glu Ala Phe Ile Ala Leu Gly Ala Gln Ala Ala
1055 1060 1065

Pro Ser Asn Cys Gly Ile Gly Ser Val Lys Ser Ala Leu Gly His
1070 1075 1080

Leu Glu Ala Ala Ala Gly Leu Thr Gly Leu Ile Lys Val Leu Leu
1085 1090 1095

Met Leu Lys His Gly Glu Gln Ala Gly Thr Arg His Phe Ser Thr
1100 1105 1110

Leu Asn Pro Leu Ile Asp Leu Arg Gly Thr Ser Phe Glu Val Val
1115 1120 1125

Ala Gln His Arg Ala Trp Pro Ser Gln Val Gly Ile His Gly Thr
1130 1135 1140

Leu Leu Pro Arg Arg Ala Gly Ile Ser Ser Phe Gly Phe Gly Gly
1145 1150 1155



Ala Asn Ala His Ala Ile Val Glu Glu His Val Ile Ala Thr Pro
1160 1165 1170

Pro Ser Thr Ser Ser Ala Gly Gly Pro Val Gly Ile Val Leu Ser
1175 1180 1185

Ala Gly Ser Glu Ala Val Leu Arg Gln Gln Val Leu Ala Leu Ser
1190 1195 1200

Ala Trp Leu Arg Gln Gln Ser Pro Thr Pro Ala Gln Met Ile Asp
1205 1210 1215

Val Ala Tyr Thr Leu Gln Val Gly Arg Ala Ala Leu Ser His Arg
1220 1225 1230

Leu Ala Phe Ser Ala Thr Asp Ala Glu Gln Ala Leu Ala Arg Leu
1235 1240 1245

Glu Gly Arg Leu Ala Gly Val Met Asp Ala Glu Val His His Gly
1250 1255 1260

Val Val Asp Ala Ala Ala Thr Ala Pro Glu His Gly Arg Gln Thr
1265 1270 1275

Arg Glu Gly Leu Ala Gly Leu Leu Arg Ala Trp Thr Gln Gly Val
1280 1285 1290

Arg Val Asp Trp Ser Ala Leu Tyr Gly Ile Gln Arg Pro Gln Arg
1295 1300 1305

Val Ser Leu Pro Val Tyr Pro Phe Ala Arg Glu Arg Tyr Trp Leu
1310 1315 1320

Pro Gly Gln Ala Met His Ala Ala Ala Asp Ala His Pro Met Leu
1325 1330 1335

Gln Leu Leu His Ala Asn Ala Lys Leu His Arg Tyr Ala Leu Arg
1340 1345 1350

Arg Ser Gly Cys Ala Ser Phe Leu Val Asp His Cys Val Asp Gly
1355 1360 1365

Arg Gln Val Leu Pro Ala Ala Val Gln Leu Glu Leu Val Arg Ala
1370 1375 1380

Val Ala Gln Arg Val Met Ala Gln Asp Glu Gly Cys Ile Glu Leu
1385 1390 1395

Ala Gln Val Ala Phe Leu His Pro Leu Met Met Glu Glu Thr Glu
1400 1405 1410

Leu Glu Val Glu Ile Glu Leu Ser Lys Ser Asp Gln Asp Glu Phe
1415 1420 1425

Asp Phe Gln Leu His Asp Ala His Arg Gln Gln Val Phe Ser Gln
1430 1435 1440

Gly His Val Arg Arg Arg Val Tyr Thr Ala Thr Pro Arg Leu Asp
1445 1450 1455



Leu Ala Gln Leu Gln Lys Leu Cys Ala Glu Arg Val Leu Ser Gly
1460 1465 1470

Glu Asp Cys Tyr Ala His Phe Thr Ala Cys Gly Leu Gln Leu Gly
1475 1480 1485

Asp Arg Leu Lys Ser Val Gln Ser Ile Gly Cys Gly Arg Asn Gly
1490 1495 1500

Glu Gly Glu Pro Ile Ala Leu Gly Val Leu Arg Leu Pro Pro Ser
1505 1510 1515

Ser Val Glu Asp Ser His Val Leu Pro Pro Ser Leu Leu Asp Gly
1520 1525 1530

Ala Leu Gln Cys Ser Leu Gly Leu Gln Arg Asp Val Glu His Ile
1535 1540 1545

Ala Met Pro Tyr Thr Leu Glu Arg Met Thr Val His Ala Pro Ile
1550 1555 1560

Pro Pro Glu Ala Trp Val Leu Leu Arg His Gly His Ala Ala Arg
1565 1570 1575

Gln Ser Leu Asp Ile Asp Leu Leu Asp Ser Glu Gly Arg Val Cys
1580 1585 1590

Val Ser Leu Gly Asn Tyr Thr Gly Arg Ala Pro Lys Ala Val Ser
1595 1600 1605

Ala Val Arg Ala Leu Val Leu Ala Pro Val Trp Gln Ala Leu Thr
1610 1615 1620



Glu Thr Ala Pro Ala Trp Pro Asp Pro Ala Glu Arg Ile Val Thr
1625 1630 1635

Val Gly Asp Asp Ala Trp Arg Ser His Phe Gly Phe Asp Glu Pro
1640 1645 1650

Ala Leu Ser Leu Glu Asp Ser Val Glu Val Ile Ala Thr Arg Leu
1655 1660 1665

Gly Gln Ser Gly Lys Phe Asp His Leu Val Trp Ile Val Pro Ile
1670 1675 1680

Ala Glu Ser Glu Thr Asp Ile Ala Ala Gln Gly Ser Ala Ala Ile
1685 1690 1695

Ala Gly Phe Arg Leu Val Lys Ala Leu Leu Ala Leu Gly Tyr Ala
1700 1705 1710

His Arg Pro Leu Gly Leu Thr Val Leu Thr Arg Gln Ala Leu Thr
1715 1720 1725

Arg Gln Pro Ser His Ala Ala Val His Gly Leu Ile Gly Thr Leu
1730 1735 1740

Ala Lys Glu Tyr Cys Asn Trp Lys Ile Arg Leu Leu Asp Leu Pro
1745 1750 1755

Ser Val Lys Ser Trp Pro Gln Trp Glu Gln Leu Arg Ser Leu Pro
1760 1765 1770





Trp His Ala Gln Gly Glu Ala Leu Ile Gly Arg Gly Thr Cys Trp
1775 1780 1785

Tyr Arg Arg Gln Leu Cys Glu Val Leu Pro Leu Pro Ser Leu Glu
1790 1795 1800

Pro Pro Pro Tyr Arg Val Gly Gly Val Tyr Val Val Ile Gly Gly
1805 1810 1815

Ala Gly Gly Leu Gly Glu Val Leu Ser Glu His Leu Ile Arg Thr
1820 1825 1830

Tyr Asp Ala Gln Leu Ile Trp Ile Gly Arg Arg Val Leu Asp Glu
1835 1840 1845

Gly Ile Ala Arg Lys Gln Thr Arg Leu Ala Ser Leu Gly Arg Ala
1850 1855 1860

Pro His Tyr Ile Ser Ala Asp Ala Ser Asp Pro Ala Ala Leu Gln
1865 1870 1875

Ala Ala His Asn Glu Ile Val Ala Leu His Gly Gln Pro His Gly
1880 1885 1890

Leu Ile Leu Ser Asn Ile Val Leu Lys Asp Ala Ser Leu Ala Arg
1895 1900 1905

Met Glu Glu Ala Asp Phe Arg Asp Val Leu Ala Ala Lys Leu Asp
1910 1915 1920





Val Ser Val Cys Ala Ala Gln Val Phe Gly Thr Ala Pro Leu Asp
1925 1930 1935

Phe Val Leu Phe Phe Ser Ser Ile Gln Ser Thr Thr Lys Ala Ala
1940 1945 1950

Gly Gln Gly Asn Tyr Ala Ala Gly Cys Cys Tyr Val Asp Ala Phe
1955 1960 1965

Gly Glu Leu Trp Ala Arg Arg Gly Leu Arg Val Lys Thr Ile Asn
1970 1975 1980

Trp Gly Tyr Trp Gly Ser Val Gly Val Val Ala Gly Glu Asp Tyr
1985 1990 1995

Arg Arg Arg Met Ala Gln Lys His Met Ala Ser Ile Glu Gly Ala
2000 2005 2010

Glu Ala Met Gln Val Leu Ser Gln Leu Leu Cys Ala Pro Leu Gln
2015 2020 2025

Arg Leu Ala Tyr Val Lys Ile Asp Asp Ala Asn Ala Met Arg Ala
2030 2035 2040

Leu Gly Val Val Glu Asp Glu Ser Val Gln Ile Pro Val His Ala
2045 2050 2055

Pro Ala Glu Pro Pro Arg Gly Gln Pro Gly Pro Val Val Glu Leu
2060 2065 2070

Ser Val Asn Leu Asp Ala Arg Arg Glu Arg Glu Thr Leu Leu Ala
2075 2080 2085



Ala Trp Leu Leu Glu Leu Ile Glu Gln Leu Gly Gly Phe Pro Pro
2090 2095 2100

Ala Ser Phe Asp Ile Ala Thr Leu Ala Gln Arg Leu His Ile Val
2105 2110 2115

Pro Ala Tyr Arg Ser Trp Leu Glu His Ser Val Arg Met Leu Gly
2120 2125 2130

Val Tyr Gly Tyr Leu Arg Ala Thr Gly Glu Ser Arg Phe Glu Leu
2135 2140 2145

Ala Asp Lys Pro Pro Asp Asp Ala Arg Gly Ala Trp Asn Ala His
2150 2155 2160

Val His Glu Ala Ser Val Glu Ala Gly Glu Glu Ala Gln Arg Arg
2165 2170 2175

Leu Leu Asp Arg Cys Met Arg Ala Leu Pro Ala Val Leu Arg Gly
2180 2185 2190

Glu Arg Lys Ala Thr Glu Leu Leu Phe Pro Glu Gly Ser Met Ala
2195 2200 2205

Trp Val Glu Gly Ile Tyr Gln Asn Asn Pro Leu Ala Asp Tyr Phe
2210 2215 2220

Asn Ala Gln Leu Val Thr Arg Leu Ile Ala Tyr Leu Arg Arg Arg
2225 2230 2235





Leu Glu Ser Thr Pro Thr Ala Arg Leu Lys Leu Cys Glu Ile Gly
2240 2245 2250

Ala Gly Ser Gly Gly Thr Thr Ala Ser Val Leu Gln Gln Leu Gln
2255 2260 2265

Ala Tyr Gly Glu His Ile Glu Glu Tyr Leu Tyr Thr Asp Leu Ser
2270 2275 2280

Pro Val Phe Leu His His Ala Glu Lys His Tyr Gln Pro Arg Ala
2285 2290 2295

Pro Tyr Leu Arg Thr Ala Cys Phe Asp Val Ala Arg Ala Pro Thr
2300 2305 2310

Ala Gln Ala Leu Glu Ser Gly Gly Tyr Asp Val Val Ile Ala Ala
2315 2320 2325

Asn Val Leu His Ala Thr Arg Asp Ile Ala Lys Thr Leu Arg Asn
2330 2335 2340

Ala Lys Ala Leu Leu Lys Pro Gly Gly Leu Leu Leu Leu Asn Glu
2345 2350 2355

Val Ile Glu Arg Ser Leu Val Leu His Leu Thr Phe Gly Leu Leu
2360 2365 2370

Glu Ser Trp Trp Leu Pro Gln Asp Lys Ile Leu Arg Leu Ala Gly
2375 2380 2385



Ser Pro Leu Leu Ala Cys Ala Thr Trp Arg Ser Leu Leu Glu Ala
2390 2395 2400

Glu Gly Phe Ala Gly Leu Ser Val His Arg Ala Gln Pro Asp Ala
2405 2410 2415

Gly Gln Ala Ile Ile Cys Ala Tyr Ser Asp Gly Ile Val Arg Gln
2420 2425 2430

Ala Ser Thr Ile Glu Val Ala Arg Asn Glu Lys Val Thr Val Pro
2435 2440 2445

Ser Gln Pro Ala Glu Ala Gly Glu Ser Pro Leu Asp Leu Val Lys
2450 2455 2460

Lys Leu Leu Gly Arg Ile Leu Lys Met Asp Pro Ala Thr Leu Asp
2465 2470 2475

Thr Ser His Pro Leu Glu Tyr Tyr Gly Val Asp Ser Ile Val Ala
2480 2485 2490

Ile Glu Leu Ala Met Ala Leu Arg Glu Thr Phe Pro Gly Phe Glu
2495 2500 2505

Val Ser Glu Leu Phe Glu Thr Gln Ser Ile Asp Thr Leu Leu Gly
2510 2515 2520

Ser Leu Glu Gln Ala Pro Leu Leu Ala Thr Leu Thr Ala Pro Pro
2525 2530 2535

Gln Gln Asp Met Leu Gln Gln Leu Lys Gln Leu Leu Ala Arg Thr
2540 2545 2550



Leu Lys Leu Asp Ile Thr Gln Ile Asp Thr Ser Lys Thr Leu Glu
2555 2560 2565

Ser Tyr Gly Val Asp Ser Ile Val Ile Ile Glu Leu Ala Asn Ala
2570 2575 2580

Leu Arg Glu Arg Tyr Pro Ser Leu Asp Ala Ser Gln Leu Met Glu
2585 2590 2595

Thr Leu Ser Ile Asp Arg Leu Val Ala Gln Trp Gln Ala Thr Glu
2600 2605 2610

Pro Ala Val Pro Ala Glu Pro Thr Ala Glu Pro Pro Val Ala Asp
2615 2620 2625

Glu Asp Ala Ala Ala Ile Ile Gly Leu Ala Gly Arg Phe Pro Gly
2630 2635 2640

Ala Asp Thr Leu Glu Glu Phe Trp Asn Asn Leu Arg Asn Gly Gln
2645 2650 2655

Ser Ser Met Gly Glu Val Pro Gly Glu Arg Trp Asp His Gln His
2660 2665 2670

Tyr Phe Asp Ser Glu Arg Gln Ala Pro Gly Lys Thr Tyr Ser Arg
2675 2680 2685

Trp Gly Ala Phe Leu Arg Asp Ile Asp Gly Phe Asp Ala Ala Phe
2690 2695 2700





Phe Glu Trp Pro Asp Ser Val Ala Leu Glu Ser Asp Pro Gln Ala
2705 2710 2715

Arg Ile Phe Leu Glu Gln Ala Tyr Ala Gly Ile Glu Asp Ala Gly
2720 2725 2730

Tyr Thr Pro Gly Ser Leu Ser Lys Ser Gln Arg Val Gly Val Phe
2735 2740 2745

Val Gly Val Met Asn Gly Tyr Tyr Ser Gly Gly Ala Arg Phe Trp
2750 2755 2760

Gln Ile Ala Asn Arg Val Ser Tyr Gln Phe Asp Phe Arg Gly Pro
2765 2770 2775

Ser Leu Ala Val Asp Thr Ala Cys Ser Ala Ser Leu Thr Ala Ile
2780 2785 2790

His Leu Ala Leu Glu Ser Leu Arg Ser Gly Ser Cys Glu Val Ala
2795 2800 2805

Leu Ala Gly Gly Val Asn Leu Leu Val Asp Pro Gln Gln Tyr Leu
2810 2815 2820

Asn Leu Ala Gly Ala Ala Met Leu Ser Ala Gly Ala Ser Cys Arg
2825 2830 2835

Pro Phe Gly Glu Ala Ala Asp Gly Phe Val Ala Gly Glu Ala Cys
2840 2845 2850



Gly Val Val Leu Leu Lys Pro Leu Lys Gln Ala Arg Ala Asp Gly
2855 2860 2865

Asp Val Ile His Ala Val Ile Arg Gly Ser Met Ile Asn Ala Gly
2870 2875 2880

Gly His Thr Ser Ala Phe Ser Ser Pro Asn Pro Ala Ala Gln Ala
2885 2890 2895

Glu Val Val Arg Gln Ala Leu Gln Arg Ala Gly Val Ala Pro Asp
2900 2905 2910

Ser Ile Ser Tyr Ile Glu Ala His Gly Thr Gly Thr Val Leu Gly
2915 2920 2925

Asp Ala Val Glu Leu Gly Ala Leu Asn Lys Val Phe Asp Lys Arg
2930 2935 2940

Ala Ala Pro Cys Pro Ile Gly Ser Leu Lys Ala Asn Ile Gly His
2945 2950 2955

Ala Glu Ser Ala Ala Gly Ile Ala Gly Leu Ala Lys Leu Val Leu
2960 2965 2970

Gln Phe Arg His Gly Glu Leu Val Pro Ser Leu Asn Ala Phe Pro
2975 2980 2985

Leu Asn Pro Tyr Ile Glu Phe Gly Arg Phe Gln Val Gln Gln Gln
2990 2995 3000

Pro Ala Pro Trp Pro Arg Arg Gly Ala Gln Pro Arg Arg Ala Gly
3005 3010 3015



Leu Ser Ala Phe Gly Ala Gly Gly Ser Asn Ala His Leu Val Val
3020 3025 3030

Glu Glu Ala Pro Ala Met Ala Pro Gly Val Ser Ile Ser Ala Ser
3035 3040 3045

Ser Pro Ala Leu Ile Val Leu Ser Ala Arg Thr Leu Pro Ala Leu
3050 3055 3060

Gln Gln Arg Ala Arg Asp Leu Leu Val Trp Met Gln Ala Arg Gln
3065 3070 3075

Val Asp Asp Val Met Leu Ala Asp Val Ala Tyr Thr Leu His Leu
3080 3085 3090

Gly Arg Val Ala Met Glu Gln Arg Leu Ala Phe Thr Ala Gly Ser
3095 3100 3105

Ala Ala Glu Leu Ser Glu Lys Leu Gln Ala Tyr Leu Gly His Ala
3110 3115 3120

Ile Arg Ala Asp Ile Tyr Leu Ser Glu Asp Thr Pro Gly Lys Pro
3125 3130 3135

Ala Gly Ala Pro Ile Val Ala Glu Glu Asp Leu Leu Thr Leu Met
3140 3145 3150

Asp Ala Trp Ile Glu Lys Gly Gln Tyr Gly Arg Leu Leu Glu Tyr
3155 3160 3165





Trp Thr Lys Gly Gln Pro Ile Asp Trp Asn Lys Leu Tyr Trp Arg
3170 3175 3180

Lys Leu Tyr Ala Asp Gly Arg Pro Arg Arg Ile Ser Leu Pro Thr
3185 3190 3195

Tyr Pro Phe Glu His Arg Arg Tyr Trp Gln Thr Pro Val Pro Gly
3200 3205 3210

Glu Arg Ser Leu His Ala Thr Ala Pro Ala Thr Arg Glu Thr Val
3215 3220 3225

Ala Val Gly Ala Met Pro Asp Pro Ala Gly Ala Thr Val Gln Ala
3230 3235 3240

Arg Leu Cys Ala Leu Cys Gln Val Leu Leu Gly Lys Pro Val Thr
3245 3250 3255

Ala Gln Met Asp Phe Phe Ala Val Gly Gly His Ser Val Leu Ala
3260 3265 3270

Ile Gln Leu Val Ser Arg Ile Arg Lys Ser Phe Gly Val Glu Tyr
3275 3280 3285

Pro Val Ser Ala Leu Phe Glu Ser Ala Leu Leu Ser Asp Met Ala
3290 3295 3300

Arg Gln Ile Glu Gln Leu Arg Val Asn Gly Val Ala Lys Arg Met
3305 3310 3315



Pro Ala Leu Leu Pro Ala Gly Arg Val Gly Ala Ile Pro Ala Thr
3320 3325 3330

Tyr Ala Gln Glu Arg Leu Trp Leu Val His Glu His Met Ser Glu
3335 3340 3345

Gln Arg Ser Ser Tyr Asn Ile Thr Phe Ala Met His Phe Arg Gly
3350 3355 3360

Val Asp Phe Arg Ala Glu Ala Met Arg Ala Ala Leu Asn Ala Leu
3365 3370 3375

Val Val Arg His Glu Val Leu Arg Thr Arg Phe Leu Ser Glu Asp
3380 3385 3390

Gly Gln Leu Gln Gln Val Ile Ala Ala Ser Leu Thr Leu Glu Val
3395 3400 3405

Pro Val Arg Glu Met Ser Val Glu Glu Val Asp Leu Leu Leu Ala
3410 3415 3420

Ala Ser Thr Arg Glu Thr Phe Asp Leu Arg Gln Gly Pro Leu Phe
3425 3430 3435

Lys Ala Arg Ile Leu Arg Val Ala Ala Asp His His Val Val Leu
3440 3445 3450

Ser Ser Ile His His Ile Ile Ser Asp Gly Trp Ser Leu Gly Val
3455 3460 3465

Phe Asn Arg Asp Leu His Gln Leu Tyr Glu Ala Cys Leu Arg Gly
3470 3475 3480



Thr Pro Pro Thr Leu Pro Thr Leu Ala Val Gln Tyr Ala Asp Tyr
3485 3490 3495

Ala Leu Trp Gln Arg Gln Trp Glu Leu Ala Ala Pro Leu Ser Tyr
3500 3505 3510

Trp Thr Arg Ala Leu Glu Gly Tyr Asp Asp Gly Leu Asp Leu Pro
3515 3520 3525

Tyr Asp Arg Pro Arg Gly Ala Thr Arg Ala Trp Arg Ala Gly Leu
3530 3535 3540

Val Lys His Arg Tyr Pro Pro Gln Leu Ala Gln Gln Leu Ala Ala
3545 3550 3555

Tyr Ser Gln Gln Tyr Gln Ala Thr Leu Phe Met Ser Leu Leu Ala
3560 3565 3570

Gly Leu Ala Leu Val Leu Gly Arg Tyr Ala Asp Arg Lys Asp Val
3575 3580 3585

Cys Ile Gly Ala Thr Val Ser Gly Arg Asp Gln Leu Glu Leu Glu
3590 3595 3600

Glu Leu Ile Gly Phe Phe Ile Asn Ile Leu Pro Leu Arg Val Asp
3605 3610 3615

Leu Ser Gly Asp Pro Cys Leu Glu Glu Val Leu Leu Arg Thr Arg
3620 3625 3630





Gln Val Val Leu Asp Gly Phe Ala His Gln Ser Val Pro Phe Glu
3635 3640 3645

His Val Leu Gln Ala Leu Arg Arg Gln Arg Asp Ser Ser Gln Ile
3650 3655 3660

Pro Leu Val Pro Val Met Leu Arg His Gln Asn Phe Pro Thr Gln
3665 3670 3675

Glu Ile Gly Asp Trp Pro Glu Gly Val Arg Leu Thr Gln Met Glu
3680 3685 3690

Leu Gly Leu T^p Arg Ser Thr Pro Ser Glu Leu Asp Trp Gln Phe
3695 3700 3705

Tyr Gly Asp Gly Ser Ser Leu Glu Leu Thr Leu Glu Tyr Ala Gln
3710 3715 3720

Asp Leu Phe Asp Glu Ala Thr Val Arg Arg Met Ile Ala His His
3725 3730 3735

Gln Gln Ala Leu Glu Ala Met Val Ser Arg Pro Gln Leu Arg Val
3740 3745 3750

Gly Lys Trp Asp Met Leu Thr Ala Glu Glu Arg Arg Leu Phe Ala
3755 3760 3765

Ala Leu Asn Ala Thr Gly Thr Pro Arg Glu Trp Pro Ser Leu Ala
3770 3775 3780

Gln Gln Phe Glu Arg Gln Ala Gln Ala Thr Pro Gln Ala Ile Ala
3785 3790 3795

Cys Val Ser Asp Gly Gln Ser Trp Ser Tyr Ala Gln Leu Glu Ala
3800 3805 3810

Arg Ala Asn Gln Leu Ala Gln Ala Leu Arg Gly Gln Gly Ala Gly
3815 3820 3825

Arg Asp Val Arg Val Ala Val Gln Ser Ala Arg Thr Pro Glu Leu
3830 3835 3840

Leu Met Ala Leu Leu Ala Ile Phe Lys Ala Gly Ala Cys Tyr Val
3845 3850 3855

Pro Ile Asp Pro Ala Tyr Pro Ala Ala Tyr Arg Glu Gln Ile Leu
3860 3865 3870

Ala Glu Val Gln Val Ser Ile Val Leu Glu Gln Asp Glu Leu Ala
3875 3880 3885

Leu Asp Glu Gln Gly Gln Phe His Asn Pro Arg Trp Arg Glu Gln
3890 3895 3900

Ala Pro Thr Pro Leu Gly Leu Arg Glu His Pro Gly Asp Leu Ala
3905 3910 3915

Cys Val Met Val Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val
3920 3925 3930

Met Val Pro Tyr Ala Gln Leu Tyr Asn Trp Leu His Ala Gly Trp
3935 3940 3945



Gln Arg Ser Pro Phe Glu Ala Gly Glu Arg Val Leu Gln Lys Thr
3950 3955 3960

Ser Ile Ala Phe Ala Val Ser Val Lys Glu Leu Leu Ser Gly Leu
3965 3970 3975

Leu Ala Gly Val Glu Gln Val Met Leu Pro Asp Glu Gln Val Lys
3980 3985 3990

Asp Ser Leu Ala Leu Ala Arg Ala Ile Glu Gln Trp Gln Val Thr
3995 4000 4005

Arg Leu Tyr Leu Val Pro Ser His Leu Gln Ala Leu Leu Asp Ala
4010 4015 4020

Thr Gln Gly Arg Asp Gly Leu Leu His Ser Leu Arg His Val Val
4025 4030 4035

Thr Ala Gly Glu Ala Leu Pro Ser Ala Val Arg Glu Thr Val Arg
4040 4045 4050

Ala Arg Leu Pro Gln Val Gln Leu Trp Asn Asn Tyr Gly Cys Thr
4055 4060 4065

Glu Leu Asn Asp Ala Thr Tyr His Arg Ser Asp Thr Val Ala Pro
4070 4075 4080

Gly Thr Phe Val Pro Ile Gly Ala Pro Ile Ala Asn Thr Glu Val
4085 4090 4095





Tyr Val Leu Asp Arg Gln Leu Arg Gln Val Pro Ile Gly Val Met
4100 4105 4110

Gly Glu Leu His Val His Ser Val Gly Met Ala Arg Gly Tyr Trp
4115 4120 4125

Asn Arg Pro Gly Leu Thr Ala Ser Arg Phe Ile Ala His Pro Tyr
4130 4135 4140

Ser Glu Glu Pro Gly Thr Arg Leu Tyr Lys Thr Gly Asp Met Val
4145 4150 4155

Arg Arg Leu Ala Asp Gly Thr Leu Glu Tyr Leu Gly Arg Gln Asp
4160 4165 4170

Phe Glu Val Lys Val Arg Gly His Arg Val Asp Thr Arg Gln Val
4175 4180 4185

Glu Ala Ala Leu Arg Ala Gln Pro Ala Val Ala Glu Ala Val Val
4190 4195 4200

Ser Gly His Arg Val Asp Gly Asp Met Gln Leu Val Ala Tyr Val
4205 4210 4215

Val Ala Arg Glu Gly Gln Ala Pro Ser Ala Gly Glu Leu Lys Gln
4220 4225 4230

Gln Leu Ser Ala Gln Leu Pro Thr Tyr Met Leu Pro Thr Val Tyr
4235 4240 4245



Gln Trp Leu Glu Gln Leu Pro Arg Leu Ser Asn Gly Lys Leu Asp
4250 4255 4260

Arg Leu Ala Leu Pro Ala Pro Gln Ala Val His Ala Gln Glu Tyr
4265 4270 4275

Val Ala Pro Arg Asn Gln Ala Glu Gln Arg Leu Ala Ala Leu Phe
4280 4285 4290

Ala Glu Val Leu Arg Val Glu Gln Val Gly Ile His Asp Asn Phe
4295 4300 4305

Phe Ala Leu Gly Gly His Ser Leu Ser Ala Ser Gln Leu Ile Ser
4310 4315 4320

Arg Ile Ala Arg Asp Met Ala Ile Asp Leu Pro Leu Ala Met Leu
4325 4330 4335

Phe Glu Leu Pro Thr Val Ala Gln Leu Ser Glu Ser Leu Ala Ser
4340 4345 4350

His Ala Arg Asp Ser Asp Tyr Asp Val Ile Pro Ala Ser Thr Glu
4355 4360 4365

Glu Ala Thr Ile Pro Leu Ser Thr Ala Gln Glu Arg Met Trp Phe
4370 4375 4380

Leu His Lys Phe Val Gln Glu Thr Pro Tyr Asn Thr Pro Gly Leu
4385 4390 4395

Ala Leu Leu Gln Gly Glu Leu Asp Ile Ser Ala Leu Gln Val Ala
4400 4405 4410



Phe Arg Cys Val Leu Glu Arg His Ala Val Leu Arg Thr His Phe
4415 4420 4425

Val Glu Thr Glu Gln Gln Cys Val Gln Val Ile Gly Ala Ala Glu
4430 4435 4440

Gln Phe Val Leu Gln Leu Arg Ser Ile Arg Asp Glu Ala Asp Leu
4445 4450 4455

His Gly Leu Leu His Thr Ala Val Ser Glu Pro Phe Asp Leu Glu
4460 4465 4470

Arg Glu Leu Pro Leu Arg Ala Leu Leu Tyr Arg Leu Asp Asp Arg
4475 4480 4485

Arg His Tyr Leu Ala Val Val Ile His His Ile Val Phe Asp Gly
4490 4495 4500

Trp Ser Thr Ser Ile Leu Phe Arg Glu Leu Ala Thr His Tyr Ala
4505 4510 4515

Ala Cys Arg His Gly Gln Ser Ala Pro Leu Pro Pro Leu Glu Leu
4520 4525 4530

Ser Tyr Ala Asp Tyr Ala Arg Trp Glu Arg Ala Arg Leu Asn Gln
4535 4540 4545

Glu Asp Ala Leu Arg Lys Leu Glu Tyr Trp Lys Thr Gln Leu Ala
4550 4555 4560






Asp Ala Pro Pro Leu Val Leu Pro Thr Thr Tyr Ala Arg Pro Val
4565 4570 4575

Phe Gln Asn Phe Asn Gly Ala Thr Val Ala Leu Gln Ile Glu Pro
4580 4585 4590

Pro Leu Leu Gln Arg Leu Gln Arg Phe Ala Asp Ala His Ser Phe
4595 4600 4605

Thr Leu Tyr Met Leu Leu Leu Ala Ala Leu Gly Val Val Leu Ser
4610 4615 4620

Arg His Ala Arg Gln Lys His Phe Cys Ile Gly Ser Pro Val Ala
4625 4630 4635

Asn Arg Ala Arg Ala Glu Leu His Gly Leu Ile Gly Leu Phe Val
4640 4645 4650

Asn Thr Leu Ala Val Arg Leu Asp Leu Asp Gly Asn Pro Ser Val
4655 4660 4665

Arg Glu Leu Leu Glu Arg Ile His Cys Thr Thr Leu Ala Ala Tyr
4670 4675 4680

Glu His Gln Asp Val Pro Phe Glu Arg Ile Val Glu Ser Leu Lys
4685 4690 4695

Val Pro Arg Asp Thr Ala Arg Asn Pro Leu Gly Gln Val Met Leu
4700 4705 4710



Asn Phe Gln Asn Met Pro Met Ser Ala Phe Asp Leu Asp Gly Val
4715 4720 4725

Gln Val Gln Val Leu Pro Met His Asn Gly Thr Ala Lys Cys Glu
4730 4735 4740

Leu Thr Phe Asp Leu Leu Leu Asp Gly Ser Arg Leu Ser Gly Phe
4745 4750 4755

Val Glu Tyr Ala Thr Gly Leu Phe Ala Pro Glu Trp Val Gln Ala
4760 4765 4770

Leu Val Gln Gln Phe Lys Cys Val Leu Ala Ala Leu Val Glu Arg
4775 4780 4785

Pro Glu Ala Ser Leu Asn Asp Leu Pro Met Ala Pro Asn Glu Ala
4790 4795 4800

Gln Pro Ala Ser Pro Ala Leu Met Lys His Val Ala Pro Ser Leu
4805 4810 4815

Pro Asn Leu Leu Glu Ala Met Ala Ala Asn Asp Ala Ala Arg Leu
4820 4825 4830

Ala Leu Gln Ala Pro Glu Gly Ala Leu Ser Tyr Ala Gln Leu Ile
4835 4840 4845

Glu Ala Ala Asn Glu Phe Ala Trp Arg Leu Arg Cys Glu His Ala
4850 4855 4860

Gly Pro Asp Lys Val Val Ala Leu Cys Leu Ala Pro Cys Ser Ala
4865 4870 4875



Leu Val Val Ala Leu Leu Ala Ala Ser Leu Cys Gly Ala Ala Ser
4880 4885 4890

Val Leu Ile Asp Pro Thr Thr Thr Ala Glu Ala Gln Tyr Asp Gln
4895 4900 4905

Leu Phe Glu Thr Arg Ala Gly Ile Val Val Thr Cys Ser Ser Leu
4910 4915 4920

Leu Glu Lys Leu Pro Leu Asp Asp Gln Ala Val Val Leu Ile Asp
4925 4930 4935

Glu Gln Ala Ala Glu Ala Thr Pro Arg Leu Met His Phe Thr Asp
4940 4945 4950

Asp Pro Ala Leu Pro Ala Met Leu Tyr Cys Val Cys Asp Glu Lys
4955 4960 4965

Gly Arg Thr Arg Thr Ile Met Val Glu Ser Gly Ser Leu Ser Ser
4970 4975 4980

Arg Leu Leu T^p Ser Val Gln Arg Phe Ser Leu Glu Arg Thr Asp
4985 4990 4995

Arg Phe Leu Leu Arg Ser Pro Leu Ser Ala Glu Leu Ala Asn Thr
5000 5005 5010

Glu Val Leu Gln Trp Leu Ala Ala Gly Gly Ser Leu Ser Ile Ala
5015 5020 5025

Pro Met His Gly Asp Phe Asp Ala Ala Ala Trp Leu Glu Thr Leu
5030 5035 5040

Ala Thr Tyr Ala Ile Thr Val Ala Tyr Leu Ala Gln Val Glu Leu
5045 5050 5055

Thr Glu Met Leu Ala His Leu Gln Asn His Pro Leu Glu Arg Asn
5060 5065 5070

Lys Leu Ala Gly Leu Arg Val Leu Val Val His Gly Ala Pro Leu
5075 5080 5085

Pro Ile Ala Pro Leu Met Arg Leu Asp Ala Trp Leu Arg Glu Val
5090 5095 5100

Gly Gly Ser Ala Arg Ile Phe Ala Ala Tyr Gly Asn Ala Glu Phe
5105 5110 5115

Gly Ala Glu Ile Leu Ser Gln Asp Val Ser Ala Ala Leu Gln Ala
5120 5125 5130

Gly Ile Gly Ala Gln Tyr Lys His Arg Arg Gly Leu Phe Pro Leu
5135 5140 5145

Gly Ala Asn Ser Met Cys His Val Val Gln Ser Asn Gly Arg Ile
5150 5155 5160

Ala Pro Asp Gly Met Val Gly Glu Leu Trp Ile Thr Gln Pro Ala
5165 5170 5175

Cys Leu Tyr Lys Thr Asp Ala Leu Val Arg Arg Leu Ala Asn Gly
5180 5185 5190



Gln Leu Glu Trp Leu Gly Ser Leu Asp Val Gln Ser Arg Ile Asp
5195 5200 5205

Asp Pro Arg Ile Asp Leu Cys Val Val Glu Ala Gln Leu Arg Leu
5210 5215 5220

Cys Glu Asp Val Gly Glu Ala Val Val Leu Tyr Glu Pro Leu Lys
5225 5230 5235

Arg Cys Leu Val Ala Tyr Leu Ser Ala Arg Ser Thr Ala Ala Ile
5240 5245 5250

Met Thr Asp Glu Thr Leu Ala Arg Ile Arg Gln Ala Leu Ser Glu
5255 5260 5265

Thr Leu Pro Asp Tyr Leu Leu Pro Ala Ile Trp Val Pro Leu Ala
5270 5275 5280

His Trp Pro Arg Leu Pro His Gly Arg Val Asp Leu Gly Ala Leu
5285 5290 5295

Pro Ala Pro Asp Phe Asp Leu Ala Arg His Glu Ser Tyr Ile Ala
5300 5305 5310

Pro Arg Thr Ala Val Glu Gln Ala Val Ala Glu Ile Trp Gln Arg
5315 5320 5325

Val Leu Lys Arg Thr Gln Val Gly Val His Asp Asn Phe Phe Glu
5330 5335 5340



Leu Gly Gly His Ser Val Leu Ala Ile Gln Leu Val Ser Gly Leu
5345 5350 5355

Arg Lys Ala Leu Ala Ile Glu Val Pro Val Thr Leu Val Phe Glu
5360 5365 5370

Ala Pro Ile Leu Gly Ala Leu Ala Arg Gln Ile Ala Pro Leu Leu
5375 5380 5385

Val Ser Glu Arg Arg Pro Arg Pro Pro Gly Leu Thr Arg Leu Glu
5390 5395 5400

His Thr Gly Pro Ile Pro Ala Ser Tyr Ala Gln Glu Arg Leu Trp
5405 5410 5415

Leu Val His Glu His Met Glu Glu Gln Arg Thr Ser Tyr Asn Ile
5420 5425 5430

Ser Asn Ala Ala His Phe Ile Gly Ala Ala Phe Ser Val Glu Ala
5435 5440 5445

Met Arg Ala Ala Leu Asn Ala Leu Val Ala Arg His Glu Val Leu
5450 5455 5460

Arg Thr Arg Phe Leu Ser Glu Asp Gly Gln Leu Gln Gln Val Ile
5465 5470 5475

Ala Ala Ser Leu Thr Leu Glu Val Pro Val Arg Glu Val Ser Ala
5480 5485 5490

Glu Glu Val Asp Leu Leu Leu Ala Ala Ser Thr Arg Glu Thr Phe
5495 5500 5505



Asp Leu Arg Gln Gly Pro Leu Phe Lys Ala Arg Ile Leu Arg Val
5510 5515 5520

Ala Ala Asp His His Val Val Leu Ser Ser Ile His His Ile Ile
5525 5530 5535

Ser Asp Gly Trp Ser Leu Gly Val Phe Asn Arg Asp Leu His Gln
5540 5545 5550

Leu Tyr Glu Ala Cys Leu Arg Gly Thr Pro Pro Thr Leu Pro Thr
5555 5560 5565

Leu Ala Val Gln Tyr Ala Asp Tyr Ala Leu Trp Gln Arg Gln Trp
5570 5575 5580

Glu Leu Ala Ala Pro Leu Ser Tyr Trp Thr Arg Ala Leu Glu Gly
5585 5590 5595

Tyr Asp Asp Gly Leu Asp Leu Pro Tyr Asp Arg Pro Arg Gly Ala
5600 5605 5610

Thr Arg Ala Trp Arg Ala Gly Leu Val Lys His Arg Tyr Pro Pro
5615 5620 5625

Gln Leu Ala Gln Gln Leu Ala Ala Tyr Ser Gln Gln Tyr Gln Ala
5630 5635 5640



Thr Leu Phe Met Ser Leu Leu Ala Gly Leu Ala Leu Val Leu Gly
5645 5650 5655

Arg Tyr Ala Asp Arg Lys Asp Val Cys Ile Gly Ala Thr Val Ser
5660 5665 5670

Gly Arg Asp Gln Leu Glu Leu Glu Glu Leu Ile Gly Phe Phe Ile
5675 5680 5685

Asn Ile Leu Pro Leu Arg Val Asp Leu Ser Gly Asp Pro Cys Leu
5690 5695 5700

Glu Glu Val Leu Leu Arg Thr Arg Gln Val Val Leu Asp Gly Phe
5705 5710 5715

Ala His Gln Ser Val Pro Phe Glu His Val Leu Gln Ala Leu Arg
5720 5725 5730

Arg Gln Arg Asp Ser Ser Gln Ile Pro Leu Val Pro Val Met Leu
5735 5740 5745

Arg His Gln Asn Phe Pro Thr Gln Glu Ile Gly Asp Trp Pro Glu
5750 5755 5760

Gly Val Arg Leu Thr Gln Met Glu Leu Gly Leu Asp Arg Ser Thr
5765 5770 5775

Pro Ser Glu Leu Asp Trp Gln Phe Tyr Gly Asp Gly Ser Ser Leu
5780 5785 5790

Glu Leu Thr Leu Glu Tyr Ala Gln Asp Leu Phe Asp Glu Ala Thr
5795 5800 5805



Val Arg Arg Met Ile Ala His His Gln Gln Ala Leu Glu Ala Met
5810 5815 5820

Val Ser Arg Pro Gln Leu Arg Val Gly Lys Trp Asp Met Leu Thr
5825 5830 5835

Ala Glu Glu Arg Arg Leu Phe Ala Ala Leu Asn Ala Thr Gly Thr
5840 5845 5850

Pro Arg Glu Trp Pro Ser Leu Ala Gln Gln Phe Glu Arg Gln Ala
5855 5860 5865

Gln Ala Thr Pro Gln Ala Ile Ala Cys Val Ser Asp Gly Gln Ser
5870 5875 5880

Trp Ser Tyr Ala Gln Leu Glu Ala Arg Ala Asn Gln Leu Ala Gln
5885 5890 5895

Ala Leu Arg Gly Gln Gly Ala Gly Arg Asp Val Arg Val Ala Val
5900 5905 5910

Gln Ser Ala Arg Thr Pro Glu Leu Leu Met Ala Leu Leu Ala Ile
5915 5920 5925

Phe Lys Ala Gly Ala Cys Tyr Val Pro Ile Asp Pro Ala Tyr Pro
5930 5935 5940

Ala Ala Tyr Arg Glu Gln Ile Leu Ala Glu Val Gln Val Ser Ile
5945 5950 5955

Val Leu Glu Gln Gly Glu Leu Ala Leu Asp Glu Gln Gly Gln Phe
5960 5965 5970



Arg Asn Arg Arg Trp Arg Glu Gln Ala Pro Thr Pro Leu Gly Leu
5975 5980 5985

Arg Gly His Pro Gly Asp Leu Ala Cys Val Met Val Thr Ser Gly
5990 5995 6000

Ser Thr Gly Arg Pro Lys Gly Val Met Val Pro Tyr Ala Gln Leu
6005 6010 6015

His Asn Trp Leu His Ala Gly Trp Gln Arg Ser Ala Phe Glu Ala
6020 6025 6030

Gly Glu Arg Val Leu Gln Lys Thr Ser Ile Ala Phe Ala Val Ser
6035 6040 6045

Val Lys Glu Leu Leu Ser Gly Leu Leu Ala Gly Val Gly Gln Val
6050 6055 6060

Met Leu Pro Asp Glu Gln Val Lys Asp Ser Leu Ala Leu Ala Arg
6065 6070 6075

Ala Ile Glu Gln Trp Gln Val Thr Arg Leu Tyr Leu Val Pro Ser
6080 6085 6090

His Leu Gln Ala Leu Leu Asp Ala Thr Gln Gly Arg Asp Gly Leu
6095 6100 6105

Leu His Ser Leu Arg His Val Val Thr Ala Gly Glu Ala Leu Pro
6110 6115 6120

Ser Ala Val Gly Glu Ala Val Arg Val Arg Leu Pro Gln Val Gln
6125 6130 6135

Leu Trp Asn Asn Tyr Gly Cys Thr Glu Leu Asn Asp Ala Thr Tyr
6140 6145 6150

His Arg Ser Asp Thr Val Ala Pro Gly Thr Phe Val Pro Ile Gly
6155 6160 6165

Ala Pro Ile Ala Asn Thr Glu Val Tyr Val Leu Asp Arg Gln Leu
6170 6175 6180

Arg Gln Val Pro Ile Gly Val Met Gly Glu Leu His Val His Ser
6185 6190 6195

Val Gly Met Ala Arg Gly Tyr Trp Asn Arg Pro Gly Leu Thr Ala
6200 6205 6210

Ser Arg Phe Ile Ala His Pro Tyr Ser Glu Glu Pro Gly Thr Arg
6215 6220 6225

Leu Tyr Lys Thr Gly Asp Met Val Arg Arg Leu Ala Asp Gly Thr
6230 6235 6240

Leu Glu Tyr Leu Gly Arg Gln Asp Phe Glu Val Lys Val Arg Gly
6245 6250 6255

His Arg Val Asp Thr Arg Gln Val Glu Ala Ala Leu Arg Ala Gln
6260 6265 6270



Pro Ala Val Ala Glu Ala Val Val Ser Gly His Arg Val Asp Gly 
6275 6280 6285 



Asp Met Gin Leu Val Ala Tyr Val Val Ala Arg Glu Gly Gin Ala 
6290 6295 6300 



Pro Ser Ala Gly Glu Leu Lys Gin Gin Leu Ser Ala Gin Leu Pro 
6305 6310 63X5 



Thr Tyr Met Leu Pro Thr Val Tyr Gin Trp Leu Glu Gin Leu Pro 
6320 6325 6330 



Arg Leu Ser Asn Gly Lys Leu Asp Arg Leu Ala Leu Pro Ala Pro 
6335 6340 6345 



Gin Val Val His Ala Gin Glu Tyr Val Ala Pro Arg Asn Glu Ala 
6350 6355 6360 



Glu Gln Arg Leu Ala Ala Leu Phe Ala Glu Val Leu Arg Val Glu
6365 6370 6375 



Gln Val Gly Ile His Asp Asn Phe Phe Ala Leu Gly Gly His Ser
6380 6385 6390 



Leu Ser Ala Ser Gln Leu Ile Ser Arg Ile Arg Gln Ser Phe His
6395 6400 6405 



Val Asp Leu Pro Leu Ser Arg Ile Phe Glu Ala Pro Thr Ile Glu
6410 6415 6420 



Gly Leu Val Arg Gln Leu Ala Leu Pro Ser Glu Gly Gly Val Ala
6425 6430 6435 



Ser Ile Ala Arg Val Ala Arg Asn Arg Thr Ile Pro Leu Ser Leu
6440 6445 6450 



Phe Gln Glu Arg Leu Trp Phe Val His Gln His Met Pro Glu Gln

6455 6460 6465 



Arg Thr Ser Tyr Asn Gly Thr Leu Ala Leu Arg Leu Arg Gly Pro 
6470 6475 6480



Leu Ser Val Glu Ala Met Arg Ala Ala Leu Arg Ala Leu Val Leu 
6485 6490 6495 



Arg His Glu Ile Leu Arg Thr Arg Phe Val Leu Pro Thr Gly Ala
6500 6505 6510 



Ser Glu Pro Val Gln Val Ile Asp Glu His Ser Asp Phe Gln Leu
6515 6520 6525 



Ser Val Gln Leu Val Glu Asp Thr Glu Ile Ala Ser Leu Met Asp

6530 6535 6540 



Glu Leu Ala Ser His Ile Tyr Asp Leu Ala Asn Gly Pro Leu Phe
6545 6550 6555



Ile Ala Cys Leu Leu Gln Leu Asp Glu Gln Glu His Val Leu Leu
6560 6565 6570 



Ile Gly Met His His Leu Ile Tyr Asp Ala Trp Ser Gln Phe Thr
6575 6580 6585 



Val Met Asn Arg Asp Leu Arg Val Leu Tyr His Arg His Leu Gly

6590 6595 6600 



Leu Ala Gly Gly Asp Leu Pro Glu Leu Pro Ile Gln Tyr Ala Asp
6605 6610 6615



Tyr Ala Ile Trp Gln Arg Ala Gln Asn Leu Asp Ala Gln Leu Ala
6620 6625 6630 



Tyr Trp Gln Ala Met Leu His Asp Tyr Asp Asp Gly Leu Glu Leu
6635 6640 6645 



Pro Tyr Asp Tyr Pro Arg Pro Arg Asn Arg Thr Trp His Ala Ala 
6650 6655 6660 



Val Tyr Thr His Thr Tyr Pro Ala Glu Leu Val Gln Arg Phe Ala

6665 6670 6675 



Gly Phe Val Gln Ala His Gln Ser Thr Leu Phe Ile Gly Leu Leu
6680 6685 6690



Ala Ser Phe Ala Val Val Leu Asn Lys Tyr Thr Gly Arg Asp Asp 
6695 6700 6705 



Leu Cys Ile Gly Thr Thr Thr Ala Gly Arg Thr His Leu Glu Leu
6710 6715 6720 



Glu Asn Leu Ile Gly Phe Phe Ile Asn Ile Leu Pro Leu Arg Leu
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6725 6730 6735 



Arg Leu Asp Gly Asp Pro Asp Val Ala Glu Ile Met Arg Arg Thr
6740 6745 6750 



Arg Leu Val Ala Met Ser Ala Phe Glu Asn Gln Ala Leu Pro Phe
6755 6760 6765 



Glu His Leu Leu Asn Ala Leu His Lys Gln Arg Asp Thr Ser Arg
6770 6775 6780 



Ile Pro Leu Val Pro Val Val Met Arg His Gln Asn Phe Pro Asp
6785 6790 6795 



Thr Ile Gly Asp Trp Ser Asp Gly Ile Arg Thr Glu Val Ile Gln
6800 6805 6810 



Arg Asp Leu Arg Ala Thr Pro Asn Glu Met Asp Leu Gln Phe Phe
6815 6820 6825 



Gly Asp Gly Thr Gly Leu Ser Val Thr Val Glu Tyr Ala Ala Glu 
6830 6835 6840 



Leu Phe Ser Glu Ala Thr Ile Arg Arg Leu Ile His His His Gln
6845 6850 6855 



Leu Val Leu Glu Gln Met Leu Ala Ala His Glu Ser Ala Thr Cys
6860 6865 6870 



Pro Leu Asp Val Ala Asp 
6875 
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<210> 27 

<211> 343 

<212> PRT 

<213> Xanthomonas albilineans

<400> 27 



Met Asp Ser Ala Leu Pro Thr Ser Ala Phe Thr Phe Asp Leu Phe Tyr 

1 5 10 15



Thr Thr Val Asn Ala Tyr Tyr Arg Thr Ala Ala Val Lys Ala Ala Ile
20 25 30 



Glu Leu Gly Leu Phe Asp Val Val Gly Gln Gln Gly Arg Thr Pro Ala
35 40 45 



Ala Ile Ala Glu Ala Cys Gln Ala Ser Pro Arg Gly Ile Arg Ile Leu
50 55 60 



Cys Tyr Tyr Leu Val Ser Ile Gly Phe Leu Arg Arg Asn Gly Gly Leu
65 70 75 80



Phe Tyr Ile Asp Arg Asn Met Ala Met Tyr Leu Asp Arg Ser Ser Pro
85 90 95 



Gly Tyr Leu Gly Gly Ser Ile Lys Phe Leu Leu Ser Pro Tyr Ile Met
100 105 110



Ser Ala Phe Thr Asp Leu Thr Ala Val Val Arg Thr Gly Lys Ile Asn
115 120 125



Leu Ala Gln Asp Gly Val Val Ala Pro Asp His Pro Gln Trp Val Glu
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130 135 140 



Phe Ala Arg Ala Met Ala Pro Met Met Ala Leu Pro Ser Ala Leu Ile
145 150 155 160



Ala Asn Met Val Ser Leu Pro Ala Asp Arg Pro Ile Arg Val Leu Asp
165 170 175 



Val Ala Ala Gly His Gly Leu Phe Gly Ile Ala Phe Ala Gln Arg Phe
180 185 190 



Arg Gln Ala Glu Val Ser Phe Leu Asp Trp Asp Asn Val Leu Asp Val
195 200 205 



Ala Arg Glu Asn Ala Gln Ala Ala Lys Val Ala Glu Arg Ala Arg Phe
210 215 220 



Leu Pro Gly Asn Ala Phe Asp Leu Asp Tyr Gly Ser Gly Tyr Asp Val 
225 230 235 240 



Ile Leu Leu Thr Asn Phe Leu His His Phe Asp Glu Val Asp Gly Glu
245 250 255 



Arg Ile Leu Ala Lys Thr Arg Asp Ala Leu Asn Asp Asp Gly Met Val
260 265 270 



Ile Thr Phe Glu Phe Ile Ala Asp Glu Glu Arg Ser Ser Pro Pro Leu
275 280 285 



Ala Ala Thr Phe Ser Met Met Met Leu Gly Thr Thr Pro Ala Gly Glu 
290 295 300 
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Ser Tyr Thr Tyr Ser Asp Leu Glu Arg Met Phe Arg His Ala Gly Phe 

305 310 315 320



Gly His Val Glu Leu Lys Ser Ile Pro Pro Ala Leu Leu Lys Val Val
325 330 335 



Val Ser Arg Lys Arg Ala Pro
340 



<210> 28 
<211> 167 
<212> PRT 

<213> Xanthomonas albilineans
<400> 28 

Met Ile Glu Ser Ala Thr Ser Pro Val Ala Lys Thr Glu Arg Ile Trp
1 5 10 15



Cys Thr Glu Leu Asp Leu Asp Ala Leu Asn Ala Met Ser Ala Asn Thr 
20 25 30 



Met Gln Ala Leu Leu Gly Ile Arg Met Ile Glu Ile Gly Ser Asp Tyr
35 40 45 



Leu Val Ser Cys Met Ser Val Asp Trp Arg Cys His Gln Pro Tyr Gly
50 55 60 



Val Leu His Gly Gly Ala Ser Val Thr Leu Ala Glu Ala Thr Gly Ser 
65 70 75 80



Met Ala Ala Ser Met Cys Val Pro Ala Gly Gln Arg Cys Val Gly Leu



85 90 95



Asp Ile Asn Ala Asn His Ile Ala Ser Ile Ser Ser Gly Gln Val Gln
100 105 110



Cys Ile Ala Arg Pro Leu His Ile Gly Ala Leu Thr Gln Val Trp Gln
115 120 125


Met Arg Ile Tyr Asp Glu Gly Asp Arg Thr Ile Cys Val Ser Arg Leu
130 135 140 



Thr Met Ala Val Leu Ser Val His Val Ala Arg Val Ser Pro Asn Pro 
145 150 155 160



Ala Ser Ser Gly Val Gln Thr

165 



<210> 29 

<211> 941

<212> PRT 

<213> Xanthomonas albilineans 



<400> 29 

Met Asn Glu Thr Ala Thr Val Thr Lys Ala Thr Leu Ser Ser Ala Lys 
1 5 10 15



Ala Ser Ile Thr Pro Ala Cys Val His Gln Trp Phe Glu Ala Gln Val

20 25 30 



Ser Ser Thr Pro Asp Ala Pro Ala Ala Phe Leu Gly Glu Arg Arg Met 
35 40 45
Ser Tyr Gly Gln Leu Asn Thr Arg Ala Asn Arg Leu Ala Arg Leu Leu
50 55 60 



Gln Ser Gln Gly Val Gly Pro Gly Ala Arg Val Ala Val Trp Met Asn
65 70 75 80 



Arg Ser Pro Glu Cys Leu Ala Ala Leu Leu Ala Val Met Lys Ala Gly 
85 90 95 



Ala Ala Tyr Val Pro Ile Asp Leu Ser Leu Pro Ile Arg Arg Val Gln
100 105 110 



Tyr Ile Leu Gln Asp Ser Gln Ala Arg Leu Val Leu Val Asp Asp Glu
115 120 125 



Gly Gln Gly Arg Leu Asp Glu Leu Glu Leu Gly Ala Met Thr Ala Val
130 135 140 



Asp Val Cys Gly Thr Leu Asp Gly Asp Glu Ala Asn Leu Asp Leu Pro 
145 150 155 160 



Cys Asp Pro Ala Gln Pro Val Tyr Cys Ile Tyr Thr Ser Gly Ser Thr
165 170 175 



Gly Ser Pro Lys Gly Val Leu Val Arg His Ser Gly Leu Ala Asn Tyr 
180 185 190 



Val Ala Trp Ala Lys Arg Gln Tyr Val Thr Ala Asp Thr Thr Ser Phe
195 200 205 



Ala Phe Tyr Ser Ser Leu Ser Phe ^p Leu Thr Val Thr Ser Ile Tyr



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210 215 220 



Val Pro Leu Val Ala Gly Leu Cys Val His Val Tyr Pro Glu Gln Gly
225 230 235 240



Asp Asp Val Pro Val Ile Asn Arg Val Leu Asp Asp Asn Gln Val Asp
245 250 255 

Val Ile Lys Leu Thr Pro Ser His Met Leu Met Leu Arg Asn Ala Ala
260 265 270 



Leu Ala Thr Ser Arg Leu Lys Thr Leu Ile Val Gly Gly Glu Asp Leu
275 280 285 



Lys Ala Ala Val Ala Tyr Asp Ile His Gln Arg Phe Arg Arg Asp Val

290 295 300 



Ala Ile Tyr Asn Glu Tyr Gly Pro Thr Glu Thr Val Val Gly Cys Ala
305 310 315 320



Ile His Arg Tyr Asp Pro Ala Thr Glu Arg Glu Gly Ser Val Pro Ile
325 330 335 



Gly Val Pro Ile Asp His Thr Ser Leu His Leu Leu Asp Glu Arg Leu
340 345 350 



Gln Pro Val Ala Pro Gly Glu Val Gly Gln Ile His Ile Gly Gly Ala
355 360 365 



Gly Val Ala Ile Gly Tyr Val Asn Lys Pro Glu Ile Thr Asp Ala Gln

370 375 380 



Phe Ile Asp Asn Pro Phe Glu Gly Ser Gly Arg Leu Tyr Ala Ser Gly
385 390 395 400



Asp Leu Gly Arg Met Arg Ala Asp Gly Lys Leu Glu Phe Leu Gly Arg 
405 410 415 



Lys Asp Ser Gln Ile Lys Leu Arg Gly Tyr Arg Ile Glu Leu Gly Glu

420 425 430 



Ile Glu Asn Val Leu Leu Gly His Ala Ala Leu Arg Glu Cys Ile Val
435 440 445



Asp Thr Thr Val Ala Pro Arg Arg Asp Tyr Asp Ser Lys Ser Leu Arg 
450 455 460 



Tyr Cys Ala Arg Cys Gly Ile Ala Ser Asn Phe Pro Asn Thr Ser Phe
465 470 475 480 



Asp Glu His Gly Val Cys Asn His Cys His Ala Tyr Asp Lys Tyr Arg 
485 490 495 



Asn Val Val Glu Asp Tyr Phe Arg Thr Glu Asp Glu Leu Arg Thr Ile

500 505 510 



Phe Glu Gln Val Lys Ala His Asn Arg Leu Arg Tyr Asp Cys Leu Val
515 520 525



Ala Phe Ser Gly Gly Lys Asp Ser Thr Tyr Ala Leu Cys Arg Val Val 
530 535 540 



Asp Met Gly Leu Arg Val Leu Ala Tyr Thr Leu Asp Asn Gly Tyr Ile
545 550 555 560 



Ser Asp Glu Ala Lys Ala Asn Val Asp Arg Val Val Arg Glu Leu Gly

565 570 575 



Val Asp His Arg Tyr Leu Gly Thr Pro His Met Asn Ala Ile Phe Val
580 585 590



Asp Ser Leu His Arg His Ser Asn Val Cys Asn Gly Cys Phe Lys Thr 
595 600 605 



Ile Tyr Thr Leu Gly Ile Asn Leu Ala His Glu Val Gly Val Ser Asp
610 615 620 



Ile Val Met Gly Leu Ser Lys Gly Gln Leu Phe Glu Thr Arg Leu Ser
625 630 635 640 



Glu Leu Phe Arg Ala Ser Thr Phe Asp Asn Gln Val Phe Glu Lys Asn

645 650 655 



Leu Met Glu Ala Arg Lys Ile Tyr His Arg Ile Asp Asp Ala Ala Ala
660 665 670



Arg Leu Leu Asp Thr Ser Cys Val Arg Asn Asp Arg Leu Leu Glu Ser 
675 680 685


Thr Arg Phe Ile Asp Phe Tyr Arg Tyr Cys Ser Val Ser Arg Lys Asp
690 695 700 



Met Tyr Arg Tyr Ile Ala Glu Arg Val Gly Trp Ser Arg Pro Ala Asp
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705 710 715 720 



Thr Gly Arg Ser Thr Asn Cys Leu Leu Asn Asp Val Gly Ile Tyr Met
725 730 735 



His Lys Lys Gln Arg Gly Tyr His Asn Tyr Ser Leu Pro Tyr Ser Trp
740 745 750 



Asp Val Arg Val Gly His Ile Pro Arg Glu Asp Ala Met Arg Glu Leu
755 760 765 



Glu Asp Thr Asp Asp Ile Asp Glu Ala Lys Val Leu Gly Leu Leu Lys
770 775 780 



Gln Ile Gly Tyr Asp Ser Ser Leu Ile Asp Thr Gln Ala Gly Asp Ala
785 790 795 800 



Gln Leu Ile Ala Tyr Tyr Val Ala Ala Glu Glu Leu Asp Pro Val Ala
805 810 815 



Leu Arg Asn Phe Ala Ala Ala Ile Leu Pro Glu Tyr Met Leu Pro Ser
820 825 830 



Tyr Phe Val Arg Leu Asp Arg Met Pro Leu Thr Pro Asn Gly Lys Val 
835 840 845 



Asn Arg Arg Ala Leu Pro Arg Pro Glu Leu Lys Lys Asn Ala Ser Glu 
850 855 860 



Ala His Thr Glu Pro Ser Ser Ala Leu Glu Gln Glu Leu Val Gln Ile
865 870 875 880 



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Trp Lys Glu Val Leu Met Val Asp Lys Val Gly Val Arg Asp Asn Phe 
885 890 895 



Phe Glu Leu Gly Gly His Ser Leu Ser Ala Leu Met Leu Leu Tyr Ser 
900 905 910 



Ile Ala Glu Arg Tyr Gln Lys Met Val Ser Ile Gln Ala Phe Ser Val
915 920 925 



Asn Pro Thr Ile Glu Gly Leu Ser Glu His Leu Val Ala
930 935 940 



<210> 30 
<211> 239 
<212> PRT 

<213> Xanthomonas albilineans
<400> 30 

Met Asp Leu Gln Cys Ala Arg Ile Ala Ala Leu Cys Glu Gln Leu Lys
1 5 10 15



Leu Ala Arg Leu Ser Ser Asp Trp Gln Ala Leu Ala Gln Ala Ala Ala
20 25 30 



Cys Glu Asp Ala Ser Tyr Phe Leu Glu Lys Val Leu Ala Ser Glu Gln
35 40 45 



Leu Ala Arg Glu Glu Arg Lys Arg Thr Val Leu Thr Arg Leu Ala Arg 
50 55 60 



Met Pro Ser Ile Lys Thr Leu Glu Gln Phe Asp Trp Ala Gln Ala Gly



65 70 75 80



Gly Ala Ser Lys Ala Gln Ile Val Glu Leu Gly His Leu Thr Phe Val
85 90 95



Glu Arg Ala Glu Asn Val Val Met Leu Gly Pro Ser Gly Val Gly Lys 

100 105 110



Thr His Ile Ala Leu Ala Leu Cys Gln Arg Ala Val Met Ala Gly His
115 120 125



Lys Ala Arg Phe Ile Thr Ala Ala Asp Leu Met Met Gln Leu Ala Ala
130 135 140



Val Lys Ala Gln Asn Arg Leu Lys Asp Tyr Phe Asn Arg Ala Val Leu



145 150 155 160



Gly Pro Lys Leu Leu Val Val Asp Glu Ile Gly Tyr Leu Pro Phe Gly
165 170 175



Arg Glu Pro Ala Gln Gly Cys Trp Ala Ala Thr Gly Phe Ala Leu Arg
180 185 190

Ser Leu Ala Ala Arg Arg Trp Lys Thr Pro Gly Gly Ser Asp Leu Leu 
195 200 205 






Arg Arg Phe Lys Gly Lys Trp Val Lys Phe Lys Ser Ala Leu Thr Ala 
210 215 220 



Asp Val Val Tyr Leu Ile Phe Arg Leu Arg Gly Ser Asp His Pro
225 230 235 
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<210> 31 

<211> 286 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 31 



Met Pro Arg Ile Glu Tyr Cys Ile Ser Met Met His Arg Arg Lys Pro
1 5 10 15



Thr Thr Asn Arg Ser Val Cys Met Arg Asp Ile Glu Arg Thr Ala Leu
20 25 30



Trp Val Ala Gly Met Arg Ala Leu Glu Ser Glu Arg Glu Gln Ala Leu
35 40 45



Phe His Asp Pro Phe Ala Arg Arg Leu Ala Gly Asp Glu Phe Val Glu
50 55 60



Glu Leu Arg Arg Asn Asn Gln Asn Val Pro Met Pro Pro Ala Ile Glu
65 70 75 80



Val Arg Thr Arg Trp Leu Asp Asp Lys Ile Met Gln Ala Val Ser Glu
85 90 95



Gly Ile Gly Gln Val Val Ile Leu Ala Ala Gly Met Asp Ala Arg Ala
100 105 110



Tyr Arg Leu Pro Trp Pro Ser Asp Thr Arg Val Tyr Glu Ile Asp His
115 120 125 



Asp Val Leu Ser Asp Lys His Glu Lys Leu His Asp Ala Gln




130 135 140 



Val Cys Gln Arg Ile Ala Leu Pro Ile Asp Leu Arg Glu Asp Trp Pro
145 150 155 160 



Gln Ala Leu Lys Glu Ser Gly Phe Val Gly Ser Ala Ala Thr Leu Trp
165 170 175 



Leu Val Glu Gly Leu Leu Cys Tyr Leu Ser Ala Glu Ala Val Met Leu 
180 185 190 



Leu Phe Ala Arg Ile Asp Ala Leu Ser Ala Lys Gly Ser Ser Val Leu
195 200 205 



Phe Asp Val Ile Gly Leu Ser Met Leu Asn Ser Pro Asn Ala Arg Val
210 215 220 



Leu His Ala Met Ala Arg Gln Phe Gly Thr Asp Glu Pro Glu Ser Leu
225 230 235 240 



Ile Gln Pro Leu Gly Trp Glu Pro Gly Val Leu Thr Ile Ala Ala Ala
245 250 255 



Gly Gln Gln Met Gly Arg Trp Pro Phe Pro Val Ala Pro Arg Gly Thr
260 265 270 



His Gly Val Pro Gln Ser Tyr Leu Val His Ala Leu Lys Arg
275 280 285 



<210> 32 
<211> 765 
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<212> PRT 

<213> Xanthomonas albilineans 
<400> 32 

Met Arg Arg Ser Pro Tyr Pro Arg Thr Leu Met Asp Ser Pro Leu Thr 
1 5 10 15



Asn Leu Pro Met His Ser Gly Thr Glu Leu Asp Leu Arg Trp Ser Val 
20 25 30 



Gly Gln Thr Arg Pro Gly Arg Asn Glu Ala Tyr Ala Arg Gln Trp Thr
35 40 45 



Thr Leu Leu His Gln Trp Arg Arg Asp Tyr Pro Gly Leu Arg Ile Asp
50 55 60 



Val Ser Asp Thr Pro Ile Gly Gln His Ile Thr Ile Asp Tyr Ala Ala
65 70 75 80 



Pro Tyr Pro Cys Gly Ser Phe Gly Ser Leu Leu Arg Glu Tyr Ala Arg 
85 90 95 



Leu Gly Lys Leu Ala Gly Leu Ile Cys Asp Tyr Leu Lys His Arg His
100 105 110 



Gln Ile Val Leu Ser Glu Ser Pro Pro Gly Ala Asn Thr Leu Ala Leu
115 120 125 



Asp Leu Gly Arg Ile Glu Glu Pro Lys Gln Leu Asp Arg Leu Gln Gly
130 135 140 
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Ala Leu Gly Met Ala Leu Glu Ala Leu Ala Thr Arg Arg Ser Asp Gly 
145 150 155 160 



Leu Leu Leu Trp His Ala Asp His Arg Gln Arg Asn Leu Pro Asp Leu
165 170 175



Arg Asp Ser Ala Val Cys Gly Ser Ala Ala Gln Ile Ser Leu Pro Ala
180 185 190



Leu Ser Cys Val Glu Asp Leu Ile Glu Val Asp Thr Ser Leu Leu Ala
195 200 205 






Cys Asp His Gly Lys Leu Cys Gln Ile Ala Ser His Leu Pro Ala Ser
210 215 220 



Trp Phe Ala Arg Ser Thr Asp Gly Pro Met Pro Ser Trp Ser Asp Ala 
225 230 235 240 



Ser Thr Ala Val Phe Ala Cys Ala Pro Ile Gly Phe Leu Pro Ser Val

245 250 255 



Gln Val Asn Val Cys Ala Gln Ile Phe Ser Ala Ala His Leu Ala Ser
260 265 270



Thr Ala Gln Met Ile Asp Pro Leu Arg Gln Gln Ala Phe Ser Tyr Arg
275 280 285 

Gln Leu Arg Ser Arg Ala Ala Thr Tyr Ala Arg His Leu Ser Leu Leu
290 295 300 



Gly Leu Gln Ser Gly Asp Ala Val Ala Leu Ile Ala Ile Asp Ser Leu



305 310 315 320
Ala Gly Val Ala Leu Met Leu Ala Cys Leu Ala Gly Gly Leu Val Phe

325 330 335 



Ala Pro Ile Asn Glu Leu Val Ser Leu Val His Phe Glu Thr Thr Leu
340 345 350



Lys Thr Ile Lys Pro Arg Leu Val Leu Ile Asp Ala Glu Leu Pro Pro
355 360 365


Ser His His Ala Ala Leu Arg His Leu Pro Thr Leu Glu Leu Thr Ser 



370 375 380



Leu Met Pro Val Ile Glu Asn Asp Glu Leu Val Val Ala Pro Cys Ser
385 390 395 400



Ala Asp Ala Pro Ala Val Met Ile Cys Thr Ser Gly Ser Thr Gly Thr
405 410 415

Pro Lys Ala Val Thr His Ser His Ala Asp Phe Met His Cys His Leu 
420 425 430



Asn Tyr Gln Gln Ala Val Leu Gly Leu Arg Ser Asp Asp Val Met Tyr
435 440 445



Thr Pro Ser Arg Leu Phe Phe Ala Tyr Gly Leu Asn Asn Leu Met Leu
450 455 460

Ser Leu Leu Ala Gly Val Ser His Val Ile Ala Ala Pro Leu Ser Val

465 470 475 480



Arg Gln Ile Ala Gln Thr Ile His Thr Tyr His Val Thr Val Leu Leu
485 490 495



Ala Val Pro Ala Val Phe Lys Leu Leu Leu Ala Glu Ala Ala Pro Asp 
500 505 510



Ala Val Trp Pro Ala Leu Arg Leu Cys Ile Ser Ala Gly Glu Ser Leu
515 520 525



Pro Ala Arg Leu Gly His Ala Ile Ser Thr Arg Trp Gln Val Glu Val
530 535 540 



Leu Asp Gly Ile Gly Cys Thr Glu Val Leu Ser Thr Phe Ile Ser Asn
545 550 555 560 



Arg Pro Gly His Ala Leu Met Gly Cys Thr Gly Thr Pro Val Pro Gly 
565 570 575 



Phe Val Val Lys Leu Val Asn Lys Gln Gly Glu Ile Cys Arg Ile Gly
580 585 590 



Glu Val Gly Ser Leu Trp Val Arg Gly Asn Thr Leu Thr Arg Gly Tyr 
595 600 605 



Val Gly Asp Pro Ile Leu Ser Ala Gln Leu Phe Val Asp Gly Trp Phe
610 615 620 



Asp Thr Arg Asp Leu Phe Phe Ala Asp Ala Lys Gly Arg Phe His Asn 
625 630 635 640 
Leu Gly Arg Met Gly Ser Ala Ile Lys Ile Asn Gly Cys Trp Leu Ser
645 650 655 



Pro Glu Thr Leu Glu Ser Val Ile Gln Thr His Ala Cys Val Lys Glu
660 665 670 



Cys Ala Ile Cys Leu Ile Glu Asp Glu Phe Gly Leu Pro Arg Pro Ala
675 680 685



Ala Phe Val Val Pro Val Asp Ala Ser Ile Asp Thr Gly Ala Leu Trp
690 695 700 



Ala Ala Leu Arg Ala Leu Cys Lys Asn Ala Leu Gly Lys His His Tyr 
705 710 715 720 



Pro His Leu Phe Val Glu Val Ser Thr Ile Pro Arg Thr Cys Ser Gly
725 730 735 



Lys Val Ile Arg Pro Ala Leu Leu Glu Thr Leu Ala Ser Ala Lys His

740 745 750 



Leu Gln Ser His Leu Phe Phe Val Gly His Ala Arg Thr
755 760 765



<210> 33 

<211> 330 

<212> PRT

<213> Xanthomonas albilineans 

<400> 33 



Met His Thr Asn Ala Asp Leu Pro Leu Thr Ile Lys Ala Asp Ser Ala
1 5 10 15



Glu Ala Thr Leu Thr Asp Trp Asn Ala Thr His Arg Ala Thr Trp Pro
20 25 30 



Thr Leu Leu Trp Gln His Arg Ala Leu Leu Phe Arg Gly Phe Ala His
35 40 45 



Pro Gly Gly Leu Glu Gln Ile Ser Arg Cys Phe Phe Asp Glu Arg Leu

50 55 60 



Ala Tyr Thr Tyr Arg Ser Thr Pro Arg Thr Asp Val Gly Gln His Val
65 70 75 80



Tyr Thr Ala Thr Glu Tyr Pro Arg Gln Leu Ser Ile Ala Gln His Cys
85 90 95 



Glu Asn Ala Tyr Gln Arg Val Trp Pro Met Lys Leu Leu Phe His Cys
100 105 110 



Val Gln Pro Ala Ser Glu Gly Gly Cys Thr Pro Leu Ala Asp Met Leu
115 120 125 



Lys Val Thr Ala Ala Ile Asp Pro Gln Val Arg Glu Ile Phe Ala Arg

130 135 140 



Lys Gln Val Arg Tyr Val Arg Asn Tyr Arg Ala Gly Val Asp Leu Pro
145 150 155 160



Trp Glu Asp Val Phe Asn Thr Arg Asn Lys Gln Glu Val Glu Ala Tyr
165 170 175



Cys Ala Arg Asn Asp Met Gln Cys Glu Trp Thr Gly Asp Gly Leu Arg
180 185 190 



Thr Ser Gln Ile Cys Arg Ala Phe Ala Cys His Pro Ala Thr Gly Asp
195 200 205 



Glu Val Trp Phe Asn Gln Ala His Leu Phe His Tyr Thr Ala Leu Glu
210 215 220 



Ala Ala Ala Gln Lys Met Met Leu Ser Phe Phe Gly Glu Gln Gly Leu
225 230 235 240 



Pro Arg Asn Ala Tyr Phe Gly Asp Gly Thr Pro Ile Asp Pro Ala Met
245 250 255 



Leu Asp His Val Arg Thr Val Phe Ala Gln His Lys Ile His Phe Asp
260 265 270 



Trp His Arg Asp Asp Val Leu Leu Ile Asp Asn Met Leu Val Ser His
275 280 285 



Gly Arg Glu Pro Tyr Glu Gly Ser Arg Lys Ile Leu Val Cys Met Ala
290 295 300 



Glu Pro Tyr Ser Pro Glu Gln Ser Ser Pro Asp Ile Ala Ala Arg Ser
305 310 315 320 



Asp Gly Glu Ala Met Leu Lys Leu His Val
325 330 



<210> 34 



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<211> 1959 
<212> PRT 

<213> Xanthomonas albilineans

<400> 34



Met Lys Leu Ser Ser Met Ser Leu Leu Asp Ala Glu Asp Val Ala Leu
1 5 10 15



Thr Ala Ala Ser Pro Asp Thr Ala Leu Ala Leu Asp Trp Ser Arg Ser
20 25 30 



Val Leu Asp Leu Phe Asp Ala Gln Val Ala Leu His Ala Glu Glu Leu

35 40 45 



Ala Cys Ala Asp Gln His Arg Gln Leu Ser Tyr Ala Gln Leu Asp Gln
50 55 60 



His Ala Asn Arg Leu Ala His Cys Leu Ile Glu Arg Gly Leu Arg Pro
65 70 75 80



Gln Glu Arg Val Ala Leu Trp Phe Gly Arg Ser Pro Asp Phe Leu Ile
85 90 95 



Ala Leu Leu Gly Val Leu Lys Ala Gly Gly Cys Tyr Val Pro Leu Asp 
100 105 110 



Pro His Tyr Pro Thr Thr Tyr Ile Gln Gln Ile Leu Asp Asp Ala Gln

115 120 125 



Pro Arg Leu Leu Leu Cys Gly Lys Asp Ile Asp Gly Gln Leu Ile Gln
130 135 140




Val Pro Arg Leu Arg Leu Asp Asp Ala Ala Ile Ala Arg Gln Pro His
145 150 155 160 

Thr Pro Leu Pro His Ala Leu His Pro Ala Gln Leu Ala Tyr Val Met

165 170 175 






Tyr Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Met Val Pro His 
180 185 190 



Arg Gln Ile Leu Asn Trp Leu His Ala Leu Trp Ala Arg Ala Pro Phe
195 200 205 



Glu Ala Gly Glu Arg Val Ala Gln Lys Thr Ser Ile Ala Phe Ala Ile
210 215 220 



Ser Val Lys Glu Leu Leu Ala Gly Leu Leu Ala Gly Val Pro Gln Val
225 230 235 240 



Phe Ile Asp Glu Asp Thr Val Arg Asp Ile Pro Ala Phe Val Arg Ala

245 250 255 



Leu Glu Thr Trp Gln Ile Thr Arg Leu Tyr Thr Phe Pro Ser Gln Leu
260 265 270



Asn Ala Leu Leu Asp His Val Ala Glu Thr Pro Gln Arg Leu Ala Arg
275 280 285 



Leu Arg Gln Leu Phe Val Ser Ile Glu Pro Cys Pro Ala Glu Leu Leu
290 295 300 



Gln Arg Leu Arg Thr Leu Leu Pro Ala Cys Thr Ala Trp Tyr Ile Tyr



305 310 315 320



Gly Cys Thr Glu Ile Asn Asp Met Thr Tyr Cys Asp Pro Ala Glu Gln
325 330 335






His Ser Gly Ser Gly Phe Val Pro Val Gly Arg Pro Ile Ala Asn Thr
340 345 350






Lys Val His Val Leu Asp Glu Gln Leu Arg Pro Leu Pro Pro Gly Ile
355 360 365



Met Gly Glu Val His Ile Glu Ser Leu Gly Ile Thr His Gly Tyr Trp
370 375 380 



Arg Gln Gly Gly Leu Thr Ala Ala Arg Phe Ile Ala Asn Pro Tyr Gly
385 390 395 400



Pro Pro Gly Ser Arg Leu Tyr Arg Thr Gly Asp Met Ala Arg Leu Leu 
405 410 415 






Asp Asn Gly Thr Leu Glu Leu Leu Gly Arg Arg Asp Tyr Glu Val Lys 
420 425 430 






Val Arg Gly Tyr Arg Val Asp Val Arg Gln Val Glu Lys Ala Leu Ala
435 440 445 



Ala His Leu Gln Val Ala Glu Ala Ala Val Ile Gly Trp Pro Gln Gly
450 455 460



Ser Pro Thr Pro Glu Leu Leu Ala Tyr Val Val Pro Arg Gln Gly Val
465 470 475 480






Leu Asn Leu Asp Glu Leu Arg Lys Leu Leu Gln Glu Arg Leu Pro Thr
485 490 495 



Tyr Met Leu Pro Thr Arg Phe Gln Ser Leu Pro Ala Leu Pro Arg Leu
500 505 510 



Pro Asn Gly Lys Leu Asp Thr Leu Ser Leu Pro Glu Pro Gln Ala Ala
515 520 525 



Ser Ser Asp Ser Asp Tyr Leu Ala Pro Arg Ser Glu Val Glu Ile Thr
530 535 540 



Leu Ala Lys Leu Trp Ser Glu Leu Leu Thr Pro Ala Gln Ala Ala Pro
545 550 555 560 



Leu Arg Val Ser Leu Asn Asp Asn Phe Phe Asn Leu Gly Gly His Ser 
565 570 575 



Leu Leu Ala Thr Gln Leu Phe Ser Arg Ile Arg Gln Ser Phe Asp Ile
580 585 590 



Glu Val Arg Val Asn Thr Leu Phe Glu Ser Pro Val Leu Glu Asp Phe 
595 600 605 



Ala Arg Val Val Asn Glu Ala Arg Gln Gln Gln Ala Pro Thr Gly Gly
610 615 620 



Asn Thr Ile Ser Ser Arg Ala Val Arg Asp Ala Pro Val Pro Leu Ser
625 630 635 640 



Tyr Gln Gln Glu Arg Leu Trp Phe Val His Glu His Met Pro Glu Gln
645 650 655 



Arg Thr Ser Tyr Asn Val Ala Phe Ala Cys His Leu Arg Ser Ala Asp

660 665 670 



Phe Ser Met Ser Ala Leu Arg Glu /aa Ile Gln Ala Leu Val Ala Arg
675 680 685



His Glu Thr Leu Arg Thr Arg Ile Ala Thr Cys Ala Gly Gly Asp Tyr
690 695 700 



Pro Ser Gln His Ile Ala Asp Ala Met Gln Val Pro Val Pro Cys Ile
705 710 715 720 



Thr Ala Thr Pro Ala Glu Val Pro Arg Leu Val Ala Glu His Ala Ala 
725 730 735 



His Val Phe Asp Leu Ala His Gly Pro Leu Leu Lys Val Ser Val Leu

740 745 750 



Arg Val Ser Asp Asp Tyr His Val Phe Leu Met Asn Met His His Ile
755 760 765



Ile Cys Asp Gly Trp Ser Ile Asn Leu Ile Phe His Asp Leu Arg Ala
770 775 780 

Phe Tyr Ile Ala Ala Leu Gln Gln Thr Pro Pro Ala Leu Pro Pro Leu
785 790 795 800 



Leu Leu Gln Tyr Ala Asp Tyr Ala Thr Trp Gln Arg Val Gln Asp Phe
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805 810 815 



Ser Ala Asp Leu Asp Tyr Trp Lys Gln Arg Leu His Gly Tyr Glu Glu
820 825 830 



Gly Leu Ala Leu Pro Tyr Asp Phe Pro Arg Pro Ala Asn Arg Ala Trp 
835 840 845 



Arg Ala Gly Ile Leu His Leu Thr Tyr Pro Asp Ala Leu Ala Ala Arg
850 855 860 



Leu Ala Ala Phe Ser Gln Glu Arg Arg Val Thr Leu Phe Met Thr Leu
865 870 875 880



Met Ala Ser Leu Ala Ile Val Leu His Gln Tyr Thr Gly Arg Arg Glu
885 890 895 



Leu Cys Leu Gly Thr Thr Ser Ala Gly Arg Asp Gln Leu Glu Thr Glu
900 905 910 



Asn Leu Ile Gly Phe Phe Val Asn Ile Leu Ala Val Arg Leu Asn Leu
915 920 925 



Gly Ser His Ala Phe Ala Glu Asp Phe Leu Gln His Val Arg Gln Gln
930 935 940



Val Leu Asp Ala Tyr Ala His Arg Ala Leu Pro Phe Glu His Val Leu 
945 950 955 960



Ser Ala Leu Lys Lys Pro Arg Asp Ser Ser Gln Ile Pro Leu Val Pro
965 970 975 
Ile Met Leu Arg His Gln Asn Phe Ala Thr Glu Gly Val Asn Ala Phe
980 985 990



Ala Gln Ile Phe Leu Ser Ala Gln Met Glu Phe Gly Glu Arg Thr Thr
995 1000 1005



Pro Asn Glu Leu Asp Leu Gln Phe Ile Gly Asp Gly Ser His Leu
1010 1015 1020



Glu Val Thr Val Glu Tyr Ala Ala Glu Leu Phe Ser Ala Ala Thr
1025 1030 1035



^9 Slu Arg Met 

1040 1045 1050



Leu Glu Glu Pro Arg Cys Arg Leu Ser Asp Phe Ser Leu Pro Val
1055 1060 1065



Ala Arg Thr Glu Phe Thr Pro His Thr Leu Asp Thr Ser Arg Ser 

1070 1075 1080



Val Leu Asp Leu Phe Asp Ala Gln Val Ala Leu His Ala Glu Glu
1085 1090 1095 



Leu Ala Cys Ala Asp Gln His Arg Gln Leu Ser Tyr Ala Gln Leu
1100 1105 1110 



Asp Gln His Ala Asn Arg Leu Ala His Cys Leu Ile Glu Arg Gly
1115 1120 1125




Leu Arg Pro Gln Glu Arg Val Ala Leu Trp Phe Gly Arg Ser Pro
1130 1135 1140



Asp Phe Leu Ile Ala Leu Leu Gly Val Leu Lys Ala Gly Gly Cys
1145 1150 1155 



Tyr Val Pro Leu Asp Pro His Tyr Pro Thr Thr Tyr Ile Gln Gln
1160 1165 1170 



Ile Leu Asp Asp Ala Gln Pro Arg Leu Leu Leu Cys Gly Lys Asp
1175 1180 1185 



Ile Asp Gly Gln Leu Ile Gln Val Pro Arg Leu Arg Leu Asp Asp
1190 1195 1200



Ala Ala Ile Ala Arg Gln Pro His Thr Pro Leu Pro His Ala Leu
1205 1210 1215 



His Pro Ala Gln Leu Ala Tyr Val Met Tyr Thr Ser Gly Ser Thr
1220 1225 1230 



Gly Arg Pro Lys Gly Val Met Val Pro His Arg Gln Ile Leu Asn
1235 1240 1245 



Trp Leu His Ala Leu Trp Ala Arg Ala Pro Phe Glu Ala Gly Lys 
1250 1255 1260 



Arg Val Ala Gln Lys Thr Ser Ile Ala Phe Ala Ile Ser Val Lys
1265 1270 1275 



Glu Leu Leu Ala Gly Leu Leu Ala Gly Val Pro Gln Val Phe Ile



1280 1285 1290



Asp Glu Asp Thr Val Arg Asp Ile Pro Ala Phe Val Arg Ala Leu
1295 1300 1305



Glu Thr Trp Gln Ile Thr Arg Leu Tyr Thr Phe Pro Ser Gln Leu
1310 1315 1320 



Asn Ala Leu Leu Asp His Val Ala Glu Thr Pro Gln Arg Leu Ala
1325 1330 1335



Arg Leu Arg Gln Leu Phe Val Ser Ile Glu Pro Cys Pro Ala Glu
1340 1345 1350



Leu Leu Gln Arg Leu Arg Thr Leu Leu Pro Ala Cys Thr Ala Trp
1355 1360 1365



Tyr Ile Tyr Gly Cys Thr Glu Ile Asn Asp Met Thr Tyr Cys Asp
1370 1375 1380



Pro Ala Glu Gln His Ser Gly Ser Gly Phe Val Pro Val Gly Arg
1385 1390 1395 



Pro Ile Ala Asn Thr Lys Val His Val Leu Asp Glu Gln Leu Arg
1400 1405 1410 



Pro Leu Pro Pro Gly Ile Met Gly Glu Val His Ile Glu Ser Leu
1415 1420 1425 



Gly Ile Thr His Gly Tyr Trp Arg Gln Gly Gly Leu Thr Ala Ala
1430 1435 1440 



Arg Phe Ile Ala Asn Pro Tyr Gly Pro Pro Gly Ser Arg Leu Tyr
1445 1450 1455 



Arg Thr Gly Asp Met Ala Arg Leu Leu Asp Asn Gly Thr Leu Glu 
1460 1465 1470



Leu Leu Gly Arg Arg Asp Tyr Glu Val Lys Val Arg Gly Tyr Arg

1475 1480 1485 



Val Asp Val Arg Gln Val Glu Lys Ala Leu Ala Ala His Leu Gln
1490 1495 1500



Val Ala Glu Ala Ala Val Ile Gly Trp Pro Gln Gly Ser Pro Thr
1505 1510 1515 



Pro Glu Leu Leu Ala Tyr Val Val Pro Arg Gln Gly Val Leu Asn
1520 1525 1530 



Leu Asp Glu Leu Arg Lys Leu Leu Gln Glu Arg Leu Pro Thr Tyr
1535 1540 1545 



Met Leu Pro Thr Arg Phe Gln Ser Leu Pro Ala Leu Pro Arg Leu

1550 1555 1560 



Pro Asn Gly Lys Leu Asp Thr Leu Ser Leu Pro Glu Pro Gln Ala
1565 1570 1575






Ala Ser Ser Asp Ser Asp Tyr Leu Ala Pro Arg Ser Glu Val Glu 
1580 1585 1590 
Ile Thr Leu Ala Lys Leu Trp Ser Glu Leu Leu Thr Pro Ala Gln
1595 1600 1605



Ala Ala Pro Leu Arg Val Ser Leu Asn Asp Asn Phe Phe Asn Leu
1610 1615 1620



Gly Gly His Ser Leu Leu Ala Thr Gln Leu Phe Ser Arg Ile Arg
1625 1630 1635



Gln Ser Phe Asp Ile Glu Val Arg Val Asn Thr Leu Phe Glu Ser
1640 1645 1650



Pro Val Leu Glu Asp Phe Ala Ala Val Val Glu Arg Gly Met Arg 
1655 1660 1665



Gln Ser Gln Ala Gly Ser Met Pro Val Ser Leu Ile Val Pro Leu
1670 1675 1680



Ser Leu Arg Thr Glu Arg Ala Ala Val Tyr Ala Ile His Pro Ile
1685 1690 1695 



Gly Gly Gln Ile His Cys Tyr Ile Asp Leu Ala Ala Ala Leu Gly
1700 1705 1710



His Ser Ala Arg Val Tyr Gly Leu Gln Cys Glu Pro Val Arg Arg
1715 1720 1725 



Phe Ala His Leu Ser Asp Leu Ala Ala His Tyr Cys Asp Ala Leu 
1730 1735 1740 



Leu Ala Gly Pro Thr Gly Ala Pro Tyr Arg Leu Leu Gly Trp Ser
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1745 1750 1755 



Ser Gly Gly Val Leu Ala Leu Ala Val Ala Glu Gln Leu Gln Arg
1760 1765 1770



Arg Gly Leu Arg Val Asp Tyr Val Gly Leu Leu Asp Ser Ser Leu 
1775 1780 1785 

Ile Pro Val His Ala Arg Glu Pro Arg Gln Leu Thr Phe Val Ala
1790 1795 1800 

Ala Leu Asn Thr Leu Ala Ala Leu Ala Lys Arg Gly Phe Glu Gln
1805 1810 1815 



Ala Glu Ile Asp Glu Ala Arg Gln Leu Leu Phe Ala Asp Gly Asp

1820 1825 1830 



Asp Glu His Val Phe Asp Tyr Ser Arg His Gln Ala Ser Leu Asp
1835 1840 1845



Lys Leu Leu Ala His Leu Arg Phe Thr Leu Glu Ser Arg Met Trp 
1850 1855 1860


Pro Pro Leu Ala Glu Gln Leu Arg Val Thr Arg Tyr His Leu Gly
1865 1870 1875 

Leu Leu Ala Gly Phe Glu Pro Gln Cys Leu Gln Pro Asn Ala His
1880 1885 1890 



Leu Tyr Gln Ala Gln Thr Ala Val His Val Ser Tyr Ala Asp Met

1895 1900 1905 






Ser Lys Pro Arg Gly Gly Ser Glu Val Leu Pro Asp Ile Thr Gly
1910 1915 1920 



Tyr Val Pro Leu Ser Gln Ile Lys Ser Ala Ala Gly Asn His Tyr
1925 1930 1935 



Ser Met Leu Gln Gly Asp Pro Leu Arg Glu Leu Ala Arg Met Leu

1940 1945 1950 



Val Thr Asp Leu Asp Ala 
1955



<210> 35 

<211> 83 

<212> PRT

<213> Xanthomonas albilineans

<400> 35 

Met Thr Phe Glu Glu Gln Ala Tyr Leu Val Leu Ile Asn Asp Glu Leu
1 5 10 15



Gln Tyr Ser Leu Trp Pro Ser Asp Leu Glu Val Pro Pro Gly Trp Arg
20 25 30



Lys Glu Gly Tyr Ala Gly Gly Lys Asp Glu Cys Met Ala Tyr Ile Asp
35 40 45 



Glu Thr Trp Thr Asp Met Arg Pro Leu Ser Leu Arg Glu Leu Asp Asp 
50 55 60 






Lys Asn Leu Gly Asp Ala Ser Ser Pro Asp Gly Ser Gly Phe Glu Ser 



65 70 75 80



Ser Tyr Ser 



<210> 36
<211> 315
<212> PRT
<213> Xanthomonas albilineans
<400> 36

Met Gly Cys Ala Cys Leu Pro His Tyr Leu Glu Lys Gln Asp Leu Ser
1 5 10 15

Ala Leu Asp Asp Ala Leu Ala Gly Val Arg Leu Ser Gln Tyr Cys Thr
20 25 30

Thr Asp Gly Arg Gln Leu Glu Leu Tyr Trp Leu Gly Ala Gln Ala Ser
35 40 45

Pro Lys Leu Val Leu Leu Pro Pro Tyr Gly Met Ser Tyr Leu Leu Leu
50 55 60

Ser Arg Leu Ala Gln Arg Leu Ala Arg His Phe His Val Leu Cys Trp
65 70 75 80

Glu Ser Ile Gly Cys Pro Asn Ala Gln Thr Ser Val Thr Ala Glu Asp
85 90 95

Phe Asp Leu Asp Arg Gln Ala Ala Thr Leu Leu Gly Ile Leu His Gln
100 105 110

His Asp Tyr Ala Asp Cys His Phe Val Gly Trp Cys Gln Ala Ala Gln
115 120 125

Leu Ala Val His Ala Ile Ala Leu His Gly Phe Ala Pro Arg Ser Met
130 135 140



10 MS ^"^^ ^''^ ^''^ Gi« 



^50 icq 



Phe Glu Arg Cys Ala Leu Pro Ile Tyr Leu Gln Ile Glu Arg His Gly
165 170 175

Leu Glu Gln Ala Lys Lys Leu Ala Ala Ile Leu Asp Lys Tyr Arg Gly
180 185 190

Gln Pro Leu Arg Gly Asp Asp Leu Ala Glu Lys Leu Thr Met Leu His
195 200 205

Leu Ala Asp Pro Ala Ser Thr Leu Val Phe Ser Arg Tyr Met Arg Ala
210 215 220

Tyr Glu Glu Asn Lys Gln Ser Val Gln Ala Leu Leu Pro Thr Ala Leu
225 230 235 240

Gly Arg His Pro Thr Leu Ile Val His Cys Lys Asp Asp Ser Phe Ser
245 250 255

His Tyr Ser Ala Ser Val Gln Leu Ala Arg His Asp Pro Ser Leu Arg
260 265 270



Leu Asp Leu Leu Asp His Gly Gly His Leu Gln Leu Phe Asn Asp

Pro

275 280 285

Gly Ala Val Ala Gln Arg Ile Ile Asp Phe Ile Gly Leu Thr Val Gly
290 295 300

Glu Val Ala Pro Thr Ser Met His Ser Ala Ala
305 310 315



<210> 37
<211> 451
<212> PRT
<213> Xanthomonas albilineans
<400> 37

Met Tyr Ile Pro Asn Asn Ile Asp Leu Asp Pro His Ser Ala Leu Val
1 5 10 15

Arg Gln Leu Thr Ser Tyr Gln Val Arg Phe Leu Gln Trp Trp Arg Leu
20 25 30

Arg Gly Pro Ser Glu Phe His Asp Arg Glu Met Asn Leu Arg Met Pro
35 40 45

Thr Gly Gly Val Lys Gly Ser Glu Trp Thr Arg Tyr His Arg Met Arg
50 55 60

Pro Ser Asp Tyr Arg Trp Gly Val Phe Met Met Pro Pro Asp Arg Asn
65 70 75 80

Thr Val Val Phe Gly Glu Arg Lys Gly Gln Val Ala Trp Ser Cys Val
85 90 95

Pro Glu Glu Tyr Arg Asp Leu Leu Leu Asp His Val Thr Val Gln Gly
100 105 110

Asp Val Glu Asn Ala Ala Val Glu Gln Ser His Glu Leu Thr Gln Met
115 120 125

Val Pro Ser Ala Ile Asp Leu Glu His Leu Phe Gln Phe Phe Leu Glu
130 135 140

Glu Gly Arg His Thr Trp Ala Met Ser His Leu Leu Ile Glu Tyr Phe
145 150 155 160

Gly Ser Asp Gly Ala Asp Ala Ala Glu Gly Leu Leu Gln Arg Met Ser
165 170 175

Gly Asp Ala Gln Asn Pro Arg Leu Leu Asp Ala Phe Asn Tyr His Thr
180 185 190

Glu Asp Trp Leu Ser His Phe Met Trp Cys Phe Phe Ala Asp Arg Val
195 200 205

Gly Lys Tyr Gln Ile Gln Ala Val Thr Gln Ser Ala Phe Leu Pro Leu
210 215 220

Ala Arg Thr Ala Arg Phe Met Met Phe Glu Glu Pro Leu His Ile Lys
225 230 235 240

Phe Gly Val Asp Gly Leu Glu Arg Val Leu Tyr Arg Ser Ala Glu Ile
245 250 255

Thr Leu Arg Glu Asp Thr His Ala Ile Phe Asp Ala Gly Ala Ile Pro
260 265 270

Leu Pro Val Val Gln Lys Tyr Leu Asn Tyr Trp Leu Pro Lys Ile Phe
275 280 285

Asp Leu Phe Gly His Asp Val Ser Glu Arg Ser Arg Val Leu Tyr Gln
290 295 300

Ala Gly Ile Arg Ser Pro Arg Asn Phe Asp Lys Leu Glu Gly Thr Glu
305 310 315 320

Val Ala Val Asp Val Arg Cys Glu Asp Arg Leu Val Ser Ser Thr Ala
325 330 335

Pro Ala Glu Leu Ala Ile Asn Ala Val Met Arg Arg Gln Tyr Ile Ala
340 345 350

Glu Val Gly Ala Ile Ile Gly Arg Trp Asn Gln Gln Leu Arg Arg Leu
355 360 365

Gly Leu Ala Phe Glu Leu Gln Leu Pro His Glu Arg Phe His Arg Asp
370 375 380

Phe Gly Pro Cys Lys Gly Leu Ala Phe Asp Leu Asp Gly Asn Pro Val
385 390 395 400

His Asp Ala Asp Gly Gln Arg Leu Ala Ala Leu Leu Pro Thr Pro Gln
405 410 415

Asp Leu Ala Gly Val Arg Gly Leu Met Gly Arg Glu Leu Gly Glu Gly
420 425 430

Arg Thr Ala Val Trp Leu Ala Pro Ala Gly Ala Ser Leu Asp Lys Leu
435 440 445

Met Pro Ala
450

<210> 38
<211> 317
<212> PRT
<213> Xanthomonas albilineans
<400> 38

Met Asn Ser Tyr Val Gly Cys Gln Lys Leu Glu Thr Asp Gly Asp Ala
1 5 10 15

Ser Arg Val Val Pro Met Trp Val Met Tyr Pro Thr Ala Thr Pro Ser
20 25 30

Arg Asp Thr Ala Met Gly Pro Tyr Thr Leu Asp Val Ala Leu Gly Ala
35 40 45

Pro Ile Glu Ala Gly Pro Phe Pro Leu Ala Val Ile Ser His Gly Thr
50 55 60

Arg Ser Ala Gly Leu Val Phe Arg Thr Leu Ala His Tyr Leu Ala Arg
65 70 75 80

His Gly Phe Ile Val Ala Leu Pro Glu His Pro Gly Asp Asn Leu Phe
85 90 95

Gln His Gln Leu Glu Tyr Ser Tyr Gln Asn Leu Glu Asp Arg Pro Arg
100 105 110

His Ile Arg Ala Val Ile Asp Thr Leu Thr Gly His Ala Gln Phe Gly
115 120 125

Pro Ala Ile Gln Ala His Asn Val Ala Val Ile Gly His Ser Val Gly
130 135 140

Gly Tyr Thr Ala Leu Ala Ile Ala Gly Gly Glu Pro His Thr Gly Phe
145 150 155 160

Met Val Asp Phe Ala His Arg Pro Glu His Ala Glu Gln Pro Ala Trp
165 170 175

Thr Ala Leu Val Arg Gln Asn Arg Val Pro Ile Arg Ala Val Pro Val
180 185 190

Thr Ala Asp Pro Arg Val Arg Ala Val Val Ala Leu Ala Pro Asp Phe
195 200 205

Ser Leu Tyr Met His Glu Asp Ala Leu Ala Lys Val Glu Val Pro Val
210 215 220

Leu Leu Ile Val Gly Glu Lys Asp Gln Trp Ala His Glu Thr Ile Val
225 230 235 240

Ala Thr Arg Thr Ala Leu Gly Asn Asp Gly Arg Leu Glu Ala Arg Val
245 250 255

Val Pro Asn Ala Gly His Tyr Ala Phe Ile Ser Val Phe Pro Glu Ala
260 265 270

Met Lys Ala Arg Val Gly Glu Ala Ala Ile Asp Pro Pro Gly Phe Asp
275 280 285

Arg Ser Ala Phe Gln Arg Glu Leu Glu Arg Asp Ile Leu His Phe Leu
290 295 300

Thr Val Thr Met Arg Pro Ala Glu Ala Ala Ile Ser Gly
305 310 315



<210> 39
<211> 496
<212> PRT
<213> Xanthomonas albilineans
<400> 39

Met Gln Lys Pro Lys Glu Ala Leu Gly Met Pro Pro Gly Met Ala Pro
1 5 10 15

Pro Gly Ala Gln Phe Asp Tyr Arg Trp Arg Trp Pro Ala Met Ile Val
20 25 30

Leu Leu Ser Ala Asn Phe Met Asn Leu Leu Asp Val Gly Ile Val Asn
35 40 45

Val Ala Leu Pro Ser Ile Gln Lys Asn Leu Gly Ala Asp Glu Gln Gln
50 55 60

Leu Glu Trp Ile Val Ala Ile Tyr Ile Leu Leu Phe Ala Leu Gly Leu
65 70 75 80

Leu Pro Leu Gly Arg Leu Gly Asp Met Leu Gly Arg Lys Arg Met Phe
85 90 95



Gly Thr Gly Val Ala Gly Phe Ile Leu Met Ser Ala Phe Cys Ala Ile
100 105 110

Ala Gly Asn Ile His Val Leu Ile Ile Ala Arg Ala Leu Gln Gly Leu
115 120 125

Ala Ala Ala Met Leu Ala Pro Gln Val Met Ala Ile Ala Gln Thr Met
130 135 140

Phe Ala Pro Lys Glu Arg Ala Ala Ala Phe Ser Leu Phe Gly Leu Val
145 150 155 160

Ala Gly Leu Ala Ser Phe Ala Gly Pro Leu Val Ser Gly Leu Leu Ile
165 170 175

His Ile Asp Ala Phe Gly Val Gly Trp Arg Ala Ile Phe Leu Ile Asn
180 185 190

Val Pro Ile Gly Leu Val Thr Leu Leu Ala Ala Ala Ile Trp Val Pro
195 200 205

Lys Val Pro Ala His Ala Gly Ile His Asn Asp Trp Val Gly Ile Ala
210 215 220

Leu Ala Ala Leu Ala Leu Leu Cys Leu Val Phe Pro Leu Ile Glu Gly
225 230 235 240

Arg Ala Tyr Gly Trp Pro Leu Trp Cys Phe Ala Ala Ile Ala Leu Gly
245 250 255

Ile Pro Leu Leu Val Ala Phe Val Ala Trp Gln Arg Arg Gln Ala His
260 265 270



Leu Ala Arg Pro Ala Leu Leu Pro Ile Tyr Leu Met Ser His Arg Asp
275 280 285

Tyr Ile Leu Gly Ala Leu Ser Val Ser Val Phe Tyr Ser Ala Leu Gln
290 295 300

Gly Phe Phe Leu Val Phe Val Ile Phe Leu Gln Gln Gly Leu Ala Tyr
305 310 315 320

Ser Ala Leu Glu Thr Gly Val Ala Thr Thr Pro Phe Pro Val Gly Val
325 330 335

Ala Ile Ala Ser Met Leu Ala Arg His Val Glu Ser Leu Arg Ala Lys
340 345 350

Ile Phe Ser Gly Ala Cys Leu Met Ile Ala Ser Tyr Leu Ala Leu Trp
355 360 365

Val Ile Ile Thr Arg Ser Glu Gly Ser Leu Asp Pro Trp Thr Leu Thr
370 375 380

Leu Pro Leu Leu Ile Gly Gly Leu Gly Cys Gly Ile Thr Ile Ala Ser
385 390 395 400

Leu Phe Gln Thr Val Met Arg Thr Val Pro Leu Lys Asp Ala Gly Ala
405 410 415



Gly Ser Gly Ala Leu Gln Val Ile Gln Gln Val Gly Gly Met Leu
420 425

Gly
430

Ile Ala Leu Val Ser Glu Ile Phe Phe Ser Gly Leu His Gln His Leu
435 440 445

Gln Gly Pro Ala Gly Val Ala Leu Ala Phe Lys Gln Ala Phe Gly Ala
450 455 460

Thr Val Val Tyr Tyr Ile Ala Ala Asn Ala Phe Val Ala Leu Ser Thr
465 470 475 480

Leu Gly Leu Gln Phe Lys Leu Thr Gln Phe Ala Pro Gln Ser Ser Pro
485 490 495



<210> 40
<211> 584
<212> PRT
<213> Xanthomonas albilineans
<400> 40

Met Lys Arg Thr Tyr Ile Gly Leu Ala Asn Ser Phe His Asp Ser Ala
1 5 10 15

Ile Ala Ile Val Gly Asp Asp Gly Gln Val Arg Phe Ala Glu Ala Thr
20 25 30

Glu Arg Tyr Leu Gln Tyr Lys Arg Ser Ile Gly Val Ala Pro Asp Val
35 40 45

Phe Gln Arg Ala Ile Lys Leu Val His Glu Tyr Gly Asp Pro Gly Ala
50 55 60



Glu Leu Val Val Ala Thr Ser Trp Ser Gly Gln Thr Pro Glu Leu Met
65 70 75 80

Arg Glu Gly Leu Gly Lys Thr Ala Gln Ala Val Asp Gln Tyr Arg Ser
85 90 95

Ala Phe Gly Asp Leu Pro Trp His Val Asn Lys Gln Phe Val Ala Gln
100 105 110

Ser Phe Phe Tyr Arg Ser Gln Leu Ala Met Val Glu His Pro Gly His
115 120 125

Leu Leu Glu Tyr Asp Leu Ser His Met Ala Glu Pro Ala Phe Lys Pro
130 135 140

Pro Ser Tyr Arg His Tyr Glu His His Leu Thr His Ala Val Ala Gly
145 150 155 160

Cys Tyr Thr Ser Pro Phe Glu Glu Ala Val Cys Ala Val Leu Asp Gly
165 170 175

Met Gly Glu Lys Asn Ala Leu Ala Cys Tyr His Tyr Gln Gln Gly Lys
180 185 190

Leu Thr Pro Ile His Gln Ser Glu Thr Ser Ser Trp Ala Ser Leu Gly
195 200 205

Phe Phe Tyr Gly Met Ile Cys Glu Val Cys Gly Phe Gly Thr Leu Ser
210 215 220



Gly Glu Glu Trp Lys Val Met Gly Leu Ala Ala Tyr Gly Gln His
225 230 235 240

Arg Gln Leu Tyr Glu Leu Leu Arg Gln Met Leu Arg Val Asp Gly Leu
245 250 255

Thr Leu Arg Phe Ala Pro Ala Ala Gln Phe Ser Gln Leu Gln Arg Thr
260 265 270

Leu Tyr Ala Met Arg Arg Cys Lys Gly Gln Pro Thr Ile Glu
275 280 285

Leu Ala

Asn Leu Ala Tyr Ala Gly Gln Gln Val Phe Cys Asp Val Leu Phe Glu
290 295 300

Phe Leu His Asn Leu His Ala Leu Gly Leu Ser Asp His Leu Val Leu
305 310 315 320

Gly Gly Gly Cys Ala Leu Asn Ser Ser Ala Asn Gly Arg Val Leu Ala
325 330 335



Glu Thr Pro Phe Arg His Leu His Val Phe Ala Ala Pro Gly Asp Asp
340 345 350

Gly Asn Ala Val Gly Ala Ala Leu Trp Ala His Ala Glu Asp His Pro
355 360 365

Glu Gln Thr Pro Pro Ala Ala Arg Glu Gln Ser Pro Tyr Leu Gly Ser
370 375 380

Ser Met Ser Ala Glu Thr Leu His Asn Val Glu Arg Phe Gly Ala Leu
385 390 395 400



Ser Lys Phe Thr Arg Cys Leu Asp Asp Ala Ala Gln Arg Ala Ala Arg
405 410 415

Leu Leu Thr Glu Gly Lys Ile Val Ala Trp Val Gln Gly Arg Ala Glu
420 425 430

Phe Gly Pro Arg Ala Leu Gly Asn Arg Ser Ile Leu Ala Asp Pro Arg
435 440 445

Ser Pro Ala Ile Lys Asp Ile Ile Asn Ala Arg Val Lys Phe Arg Glu
450 455 460

Glu Phe Arg Pro Phe Ala Pro Ser Ile Leu His Glu His Gly Ala Glu
465 470 475 480

Tyr Phe Glu Leu Tyr Gln Glu Ser Pro Tyr Met Glu Arg Thr Leu Lys
485 490 495

Phe Arg Ala Glu Ala Thr Arg Lys Val Pro Gly Val Val His His Asp
500 505 510

Gly Thr Gly Arg Leu Gln Thr Val Lys Gln His Trp Asn Pro Arg Tyr
515 520 525

His Ala Leu Ile Lys Glu Phe Tyr Arg Leu Thr Gly Ile Pro Leu Val
530 535 540

Leu Asn Thr Ser Phe Asn Val Met Gly Lys Pro Ile Ala His Ser Val
545 550 555 560

Glu Asp Ala Leu Ser Ile Phe Phe Thr Ser Gly Leu Asp Ala Met Phe
565 570 575

Ile Asp Asp Val Leu Ile Glu Lys
580



<210> 41
<211> 88
<212> PRT
<213> Xanthomonas albilineans
<400> 41

Met Arg Thr Ser Lys Phe Asn Glu Thr Gln Ile Ile Ala Thr Leu Lys
1 5 10 15

Gln Ala Asp Ala Gly Val Pro Val Lys Asp Ile Cys Arg Gln Val Gly
20 25 30

Ile Ser Thr Ala Thr Tyr Tyr Gln Trp Lys Ser Lys Tyr Val Ala Ser
35 40 45

Glu Met Pro Ser Ser Arg His Thr Ser Leu Thr Trp Arg Pro Pro Ser
50 55 60

Thr Cys Phe Ser Val Ala Thr Ile Trp Leu Ser Val Asn Leu Leu Leu
65 70 75 80

Arg Ile Val Gly Arg Leu Gly Gly
85



<210> 42
<211> 716
<212> PRT
<213> Xanthomonas albilineans
<400> 42

Met Arg Cys Leu Ile Ile Asn Asn Tyr Asp Ser Phe Thr Trp Asn Leu
1 5 10 15

Ala Asp Tyr Val Ala Gln Ile Phe Gly Glu Asp Pro Leu Val Val His
20 25 30

Asn Asp Glu Tyr Ser Trp His Glu Leu Lys Asp Arg Gly Gly Phe Ser
35 40 45

Ser Ile Ile Val Ser Pro Gly Pro Gly Ser Val Val Asn Glu Ala Asp
50 55 60

Phe His Ile Ser Leu Gln Ala Leu Glu Gln Asn Glu Phe Pro Val Leu
65 70 75 80

Gly Val Cys Leu Gly Phe Gln Gly Leu Ala His Val Tyr Gly Gly Arg
85 90 95

Ile Leu His Ala Pro Val Pro Phe His Gly Arg Arg Ser Thr Val Ile
100 105 110

Asn Thr Gly Asp Gly Leu Phe Glu Gly Ile Pro Gln Arg Phe Glu Ala
115 120 125

Val Arg Tyr His Ser Leu Met Val Cys Gln Gln Ser Leu Pro Pro Val
130 135 140

Leu Lys Val Thr Ala Arg Thr Asp Cys Gly Val Val Met Gly Leu Gln
145 150 155 160

His Val Gln His Pro Lys Trp Gly Val Gln Phe His Pro Glu Ser Ile
165 170 175

Leu Thr Glu His Gly Lys Arg Ile Val Ala Asn Phe Ala Lys Leu Ala
180 185 190

Ala Arg His Ser Ala Pro Leu Leu Ala Gly Ser Glu Gln Ala Gly Lys
195 200 205

Val Leu Ser Val Cys Ala Pro Glu Met Val Thr Pro Arg Val Arg Arg
210 215 220

Met Leu Ser Arg Lys Ile Lys Cys Arg Trp Gln Ala Glu Asp Val Phe
225 230 235 240

Leu Ala Leu Phe Ala Asp Glu Lys His Cys Phe Trp Leu Asp Ser Gln
245 250 255

Leu Val Cys Ser Pro Met Ala Arg Tyr Ser Phe Met Gly Ala Val Asn
260 265 270

Glu Ser Glu Val Val Arg His Cys Val Arg Pro Gly Ser Met Val Gln
275 280 285

Glu Ala Gly Glu Arg Phe Leu Ala Glu Met Asp Arg Ala Leu Gln Ser
290 295 300

Val Leu Thr Glu Asp Val Ala Glu Arg Pro Pro Phe Ala Phe Arg Gly
305 310 315 320



Gly Tyr Val Gly Tyr Met Ser Tyr Glu Met Lys Ser Val Phe Gly Ala
325 330 335

Pro Ala Ser His Ala Asn Ala Ile Pro Asp Ala Leu Trp Met Arg Val
340 345 350

Glu Arg Phe Val Ala Phe Asp His Ala Thr Glu Glu Val Trp Leu Leu
355 360 365

Ala Leu Ala Asp Thr Glu Asp Leu Ser Ala Leu Ala Trp Leu Asp Ala
370 375 380

Ile Glu Gln Arg Ile His Ala Ile Gly Gln Ala Ala Pro Ala Cys Ile
385 390 395 400

Ser Leu Gly Leu Arg Ser Met Glu Ile Glu Leu Asn His Gly Arg Arg
405 410 415

Gly Tyr Leu Glu Ala Ile Glu Arg Cys Lys Gln Arg Ile Val Asp Gly
420 425 430

Glu Ser Tyr Glu Ile Cys Leu Thr Asp Leu Phe Ser Phe Gln Ala Glu
435 440 445

Leu Asp Pro Leu Met Leu Tyr Arg Tyr Met Arg Arg Gly Asn Pro Ala
450 455 460

Pro Phe Gly Ala Tyr Leu Arg Asn Gly Ser Asp Cys Ile Leu Ser Thr
465 470 475 480



Ser Pro Glu Arg Phe Leu Glu Val Asp Gly His Gly Thr Ile Gln Thr
485 490 495

Lys Pro Ile Lys Gly Thr Cys Arg Arg Ala Glu Asp Pro Gln Leu Asp
500 505 510

Arg Asn Leu Ala Met Arg Leu Ala Ala Ser Glu Lys Asp Arg Ala Glu
515 520 525

Asn Leu Met Ile Val Asp Leu Met Arg Asn Asp Leu Ser Arg Val Ala
530 535 540

Val Pro Gly Ser Val Thr Val Pro Lys Leu Met Asp Ile Glu Ser Tyr
545 550 555 560

Lys Thr Val His Gln Met Val Ser Thr Val Glu Ala Arg Leu Arg Ala
565 570 575

Asp Cys Ser Leu Val Asp Leu Leu Lys Ala Val Phe Pro Gly Gly Ser
580 585 590

Ile Thr Gly Ala Pro Lys Leu Arg Ser Met Glu Ile Ile Asp Gly Leu
595 600 605

Glu Asn Ala Pro Arg Gly Val Tyr Cys Gly Ser Ile Gly Tyr Leu Gly
610 615 620

Tyr Asn Cys Val Ala Asp Leu Asn Ile Ala Ile Arg Ser Leu Ser Tyr
625 630 635 640

Asp Gly Gln Glu Ile Arg Phe Gly Ala Gly Gly Ala Ile Thr Phe Leu
645 650 655

Ser Asp Pro Gln Asp Glu Phe Asp Glu Val Leu Leu Lys Ala Glu Ala
660 665 670

Ile Leu Lys Pro Ile Trp His Tyr Leu His Ala Pro Asn Thr Pro Leu
675 680 685

His Tyr Glu Leu Arg Glu Asp Lys Leu Leu Leu Ala Glu His Cys Val
690 695 700

Ser Glu Met Pro Ala Arg Gln Ala Phe Ile Glu Pro
705 710 715



<210> 43
<211> 137
<212> PRT
<213> Xanthomonas albilineans
<400> 43

Met Arg Pro Pro Arg Leu Arg Ala Asn Gln Asp Gly Leu Leu Met Asp
1 5 10 15

Thr Ala Gly Arg Val Val Glu Gly Cys Thr Ser Asn Leu Phe Leu Val
20 25 30

Glu Asn Gly His Leu Val Thr Pro Asp Leu Gly Val Ala Gly Val Ser
35 40 45

Gly Ile Met Arg Gly Arg Val Ile Glu Tyr Gly Arg Gln His Gly Leu
50 55 60

Ala Cys Ala Val Lys His Val Tyr Pro Asp Gln Leu Val Arg Ala Gln
65 70 75 80

Glu Val Phe Leu Thr Asn Ala Val Phe Gly Ile Leu Leu Val Arg Ser
85 90 95

Ile Asp Ala His Ser Tyr Arg Ile Asp Pro Val Thr Leu Arg Leu Leu
100 105 110

( 



15 ^5 ''^^ ^ Arg Ser Leu His Gin 



120 



125 



20 



Val Ser Thr His Ala Gly Gln Asp Pro
130 135



<210> 44
<211> 200
<212> PRT
<213> Xanthomonas albilineans
<400> 44

Met Pro Ala Lys Thr Leu Glu Ser Lys Asp Tyr Cys Gly Glu Ser Phe
1 5 10 15

Val Ser Glu Asp Arg Ser Gly Gln Ser Leu Glu Ser Ile Arg Phe Glu
20 25 30

Asp Cys Thr Phe Arg Gln Cys Asn Phe Thr Glu Ala Glu Leu Asn Arg
35 40 45

Cys Lys Phe Arg Glu Cys Glu Phe Val Asp Cys Asn Leu Ser Leu Ile
50 55 60

Ser Ile Pro Gln Thr Ser Phe Met Glu Val Arg Phe Val Asp Cys Lys
65 70 75 80

Met Leu Gly Val Asn Trp Thr Ser Ala Gln Trp Pro Ser Val Lys Met
85 90 95

Glu Gly Ala Leu Ser Phe Glu Arg Cys Ile Leu Asn Asp Ser Leu Phe
100 105 110

Tyr Gly Leu Tyr Leu Ala Gly Val Lys Met Val Glu Cys Arg Ile His
115 120 125

Asp Ala Asn Phe Thr Glu Ala Asp Cys Glu Asp Ala Asp Phe Thr Gln
130 135 140

Ser Asp Leu Lys Gly Ser Thr Phe His Asn Thr Lys Leu Thr Gly Ala
145 150 155 160

Ser Phe Ile Asp Ala Val Asn Tyr His Ile Asp Ile Phe His Asn Asp
165 170 175

Ile Lys Arg Ala Arg Phe Ser Leu Pro Glu Ala Ala Ser Leu Leu Asn
180 185 190

Ser Leu Asp Ile Glu Leu Ser Asp
195 200



<210> 45
<211> 202
<212> PRT
<213> Xanthomonas albilineans
<400> 45

Met His Pro Pro Ser Pro Leu Asn Thr Gln Gln Lys Asp Trp Leu Thr
1 5 10 15



Arg Gly Gly Ser Leu Thr Ala His Leu Arg Leu Leu Gly Gln Val
20 25 30

Gln

Val Gln Val Gln Arg Glu His Lys Ser Met Ala Trp Leu Asp Glu Tyr
35 40 45

Arg Val Leu Gly Leu Ser Arg Cys Leu Leu Val Trp Val Arg Glu
50 55 60

Val Leu Val Val Asp Ala Lys Pro Tyr Val Tyr Ala Arg Ser Leu

Val

65 70 75

Thr
80



Pro Leu Thr Ala Ser Tyr Asn Ala Trp Gln Ala Val Arg Ser Ile Gly
85 90 95

Ser Arg Pro Leu Ala Asp Leu Leu Phe Arg Asp Arg Ser Val Leu Arg
100 105 110

Ser Ala Leu Ala Ser Arg Arg Ile Thr Ala Gln His Pro Leu His Arg
115 120 125

Arg Ala Cys Asn Phe Val Ala Gln Ser His Ala Thr Gln Ala Leu Leu
130 135 140

Ala Arg Arg Ser Val Phe Thr Arg Gln Gly Ala Pro Leu Leu Ile Thr
145 150 155 160

Glu Cys Met Leu Pro Ala Leu Trp Ala Thr Leu Glu Pro Val Ala Ala
165 170 175

Pro Arg Gln Ala Ser Leu Ser Ala Asp Gly Pro Cys Arg His Ser Ala
180 185 190

Gln Ile Val Ser Pro Glu Ser Met Leu Glu
195 200



<210> 46
<211> 278
<212> PRT
<213> Xanthomonas albilineans
<400> 46

Met Pro Asn Ala Val Pro Met Gln Gly Ala Arg Gly Leu Pro Gln Pro
1 5 10 15

Gln Ala Met Asn Pro Gly Leu Pro Ser Val Gly Gly Leu Ser Ala Gly
20 25 30

Gln Pro Leu Gln Leu Ser Leu Ala Pro Glu Leu Gln Ala Ala Ala Arg
35 40 45

Ser Ala His Arg His Leu Leu Asp Asp Gly Thr Ala Leu Tyr Leu Leu
50 55 60

Ala Phe Asp Thr Ala Gln Phe Asp Pro Gly Ala Phe Ala Ala Met Ala
65 70 75 80

Ile Ala Arg Pro Asp Ser Ile Ala Arg Ser Val Arg Lys Arg Gln Ala
85 90 95

Glu Phe Leu Phe Gly Arg Leu Ala Ala Arg Leu Ala Leu Gln Glu Val
100 105 110

Leu Gly Pro Ala Gln Ala Gln Ala Asp Ile Ala Ile Gly Ala Thr Arg
115 120 125

Ala Pro Cys Trp Pro Ala Gly Ser Leu Gly Ser Ile Ser His Cys Glu
130 135 140

Asp Tyr Ala Ala Ala Ile Ala Met Ala Ala Gly Thr Arg His Gly Val
145 150 155 160

Gly Ile Asp Leu Glu Arg Pro Ile Thr Pro Ala Ala Arg Ala Ala Leu
165 170 175

Leu Ser Ile Ala Ile Asp Ala Asp Glu Ala Ala Arg Leu Ala Lys Ala
180 185 190

Ala Asp Ala Gln Trp Pro Gln Asp Leu Leu Leu Thr Ala Leu Phe Ser
195 200 205

Ala Lys Glu Ser Leu Phe Lys Ala Ala Tyr Ser Ala Val Gly Arg Tyr
210 215 220

Phe Asp Phe Ser Ala Ala Arg Leu Cys Gly Ile Asp Leu Ala Arg Gln
225 230 235 240

Cys Leu His Leu Arg Leu Thr Glu Thr Leu Cys Ala Gln Phe Val Ala
245 250 255

Gly Gln Val Cys Glu Val Gly Phe Ala Arg Leu Pro Pro Asp Leu Val
260 265 270

Leu Thr His Tyr Ala Trp
275



<210> 47
<211> 634
<212> PRT
<213> Xanthomonas albilineans
<400> 47

Met Ser Val Glu Thr Gln Lys Glu Thr Leu Gly Phe Gln Thr Glu Val
1 5 10 15

Lys Gln Leu Leu Gln Leu Met Ile His Ser Leu Tyr Ser Asn Lys Glu
20 25 30

Ile Phe Leu Arg Glu Leu Ile Ser Asn Ala Ser Asp Ala Ala Asp Lys
35 40 45

Leu Arg Phe Glu Ala Leu Val Lys Pro Glu Leu Leu Asp Gly Asp Ala
50 55 60

Gln Leu Arg Ile Arg Ile Gly Phe Asp Lys Asp Ala Gly Thr Val Thr
65 70 75 80

Ile Asp Asp Asn Gly Ile Gly Met Ser Arg Glu Glu Ile Val Ala His
85 90 95



Leu Gly Thr Ile Ala Lys Ser Gly Thr Ser Asp Phe Leu Lys His Leu
100 105 110

Ser Gly Asp Gln Lys Lys Asp Ser His Leu Ile Gly Gln Phe Gly Val
115 120 125

Gly Phe Tyr Ser Ala Phe Ile Val Ala Asp Gln Val Asp Val Tyr Ser
130 135 140

Arg Arg Ala Gly Leu Pro Ala Ser Asp Gly Val His Trp Ser Ser Arg
145 150 155 160

Gly Glu Gly Glu Phe Glu Val Ala Thr Ile Asp Lys Pro Glu Arg Gly
165 170 175

Thr Arg Ile Val Leu His Leu Lys Glu Glu Glu Lys Gly Phe Ala Asp
180 185 190

Gly Trp Lys Leu Arg Ser Ile Val Arg Lys Tyr Ser Asp His Ile Ala
195 200 205

Leu Pro Ile Glu Leu Ile Lys Glu His Tyr Gly Glu Asp Lys Asp Lys
210 215 220

Pro Glu Thr Pro Glu Trp Glu Thr Val Asn Arg Ala Ser Ala Leu Trp
225 230 235 240

Thr Arg Pro Arg Thr Glu Ile Lys Asp Glu Glu Tyr Gln Glu Leu Tyr
245 250 255

Lys His Ile Ala His Asp His Glu Asn Pro Val Ala Trp Ser His Asn
260 265 270

Lys Val Glu Gly Lys Leu Glu Tyr Thr Ser Leu Leu Tyr Leu Pro Gly
275 280 285

Arg Ala Pro Phe Asp Leu Tyr Gln Arg Asp Ala Ser Arg Gly Leu Lys
290 295 300

Leu Tyr Val Gln Arg Val Phe Ile Met Asp Gln Ala Asp Gln Phe Leu
305 310 315 320

Pro Leu Tyr Leu Arg Phe Ile Lys Gly Ile Val Asp Ser Ser Asp Leu
325 330 335

Pro Leu Asn Val Ser Arg Glu Ile Leu Gln Ser Gly Pro Val Ile Asp
340 345 350

Ser Met Lys Ser Ala Leu Thr Lys Arg Ala Leu Asp Met Leu Glu Lys
355 360 365

Leu Ala Lys Asp Asp Pro Glu Arg Tyr Lys Gly Val Trp Lys Asn Phe
370 375 380

Gly Gln Val Leu Lys Glu Gly Pro Ala Gln Asp Phe Gly Asn Arg Glu
385 390 395 400

Lys Ile Ala Gly Leu Leu Arg Phe Ala Ser Thr His Ser Gly Asp Asp
405 410 415

Ala Gln Asn Val Ser Leu Ala Asp Tyr Val Ala Arg Met Lys Asp Gly
420 425 430



Gln Asp Lys Leu Tyr Tyr Leu Thr Gly Glu Ser Tyr Ala Gln Ile Lys
435 440 445

Asp Ser Pro His Leu Glu Val Phe Arg Lys Lys Gly Ile Glu Val Leu
450 455 460

Leu Leu Thr Asp Arg Ile Asp Glu Trp Leu Met Ser Tyr Leu Thr Glu
465 470 475 480

Phe Asp Ser Lys Ser Phe Val Asp Val Ala Arg Gly Asp Leu Asp Leu
485 490 495

Gly Lys Leu Asp Ser Glu Glu Glu Lys Gln Ala Gln Glu Glu Ala Ala
500 505 510

Lys Ala Lys Gln Gly Leu Ala Glu Arg Ile Gln Gln Val Leu Lys Asp
515 520 525

Glu Val Ala Glu Val Arg Val Ser His Arg Leu Thr Asp Ser Pro Ala
530 535 540

Ile Leu Ala Ile Gly Gln Gly Asp Met Gly Leu Gln Met Arg Gln Ile
545 550 555 560

Leu Glu Ala Ser Gly Gln Lys Leu Pro Glu Ser Lys Pro Val Phe Glu
565 570 575

Phe Asn Pro Ala His Pro Leu Ile Glu Lys Leu Asp Ala Glu Pro Asp
580 585 590



Val Asp Arg Phe Gly Asp Leu Ala Arg Val Leu Phe Asp Gln Ala Ala
595 600 605

Leu Ala Ala Gly Asp Ser Leu Lys Asp Pro Ala Ala Tyr Val Arg Arg
610 615 620

Leu Asn Lys Leu Leu Leu Glu Leu Ser Ala
625 630



<210> 48
<211> 20
<212> DNA
<213> Xanthomonas albilineans
<400> 48

gcgtaccgtt gtccagtagg 20

<210> 49
<211> 20
<212> DNA
<213> Xanthomonas albilineans
<400> 49

gctggaaacc gagaatctga 20

<210> 50
<211> 20
<212> DNA
<213> Xanthomonas albilineans
<400> 50

gacacgatca gccgctagga 20

<210> 51
<211> 20
<212> DNA
<213> Xanthomonas albilineans
<400> 51

accagcagtt gggccagcct 20

<210> 52
<211> 19
<212> DNA
<213> Xanthomonas albilineans
<400> 52

tgcccacagg ccgtcgagt 19

<210> 53
<211> 20
<212> DNA
<213> Xanthomonas albilineans
<400> 53

gcgagaggac aagctgctgc 20

<210> 54
<211> 20
<212> DNA
<213> Xanthomonas albilineans
<400> 54

cgttgaggat gcagcgctcg 20
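
The short DNA entries of SEQ ID NOs: 48 to 54 can be sanity-checked in the same way as the protein entries. The following is a minimal sketch, assuming the sequences as read from the listing above. The function names `gc_fraction` and `wallace_tm` are illustrative, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is only a rough rule of thumb for short oligonucleotides; the application does not state any melting temperature.

```python
# Oligonucleotide sequences of SEQ ID NOs: 48-54 as transcribed from the
# listing (spaces between blocks of ten removed).
PRIMERS = {
    48: "gcgtaccgttgtccagtagg",
    49: "gctggaaaccgagaatctga",
    50: "gacacgatcagccgctagga",
    51: "accagcagttgggccagcct",
    52: "tgcccacaggccgtcgagt",
    53: "gcgagaggacaagctgctgc",
    54: "cgttgaggatgcagcgctcg",
}

# Values of the <211> field for each entry.
DECLARED_LENGTHS = {48: 20, 49: 20, 50: 20, 51: 20, 52: 19, 53: 20, 54: 20}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.lower()
    return (seq.count("g") + seq.count("c")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in deg C by the Wallace rule."""
    seq = seq.lower()
    at = seq.count("a") + seq.count("t")
    gc = seq.count("g") + seq.count("c")
    return 2 * at + 4 * gc


for seq_id, seq in PRIMERS.items():
    length_ok = len(seq) == DECLARED_LENGTHS[seq_id]
    print(seq_id, len(seq), length_ok, round(gc_fraction(seq), 2), wallace_tm(seq))
```

A length mismatch here would indicate a base lost or added during extraction, since every entry carries its declared length in the <211> field.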



CLAIMS: 
We claim: 



1. DNA molecules encoding the Albicidin Biosynthetic Gene Clusters and
proteins selected from the group consisting of:

(a) isolated DNA fragments which encode proteins that in turn
individually and collectively perform functions in Albicidin Biosynthesis;

(b) isolated DNA which hybridizes to isolated DNA of (a) above and that
encodes a protein that in turn performs an individual function in Albicidin
Biosynthesis; and

(c) isolated DNA differing from the isolated DNAs of (a) and (b) above
in codon sequence due to the degeneracy of the genetic code, and which encodes
a protein that in turn performs a function in Albicidin Biosynthesis

(d) isolated DNA selected from the group of DNA molecules having a
sequence that is at least 70% homologous with a DNA comprising one or more
of SEQ ID Nos. 1 to 25.

2. Isolated DNA molecules of claim 1 comprising any one of SEQ ID No. 1, SEQ
ID No. 2 or SEQ ID No. 3.

3. A vector comprising a purified and isolated DNA molecule(s) of claim 1
operably linked to promoters.

4. A host cell comprising an isolated DNA molecule of claim 1.

5. A host cell comprising the isolated DNA molecule of claim 2.

6. A host cell comprising a vector of claim 3.

7. A method of producing a protein, wherein said protein consists of an amino
acid sequence selected from the group consisting of SEQ ID Nos. 26 to 48,
comprising the steps of: expressing DNA molecules of Claim 1 in a host cell,
wherein said DNA molecules encode a protein, and wherein the expression of
said DNA molecules leads to the production of Albicidins by said cell.

8. A method of producing a polyketide carrying para-aminobenzoic acid and/or
carbamoyl benzoic acid by inserting at least one DNA Fragment of Claim 1
that encodes a PKS protein into a cell and causing the cell to express the
encoded PKS protein under conditions such that the PKS protein functions to
produce a polyketide carrying either a para-aminobenzoic acid or a carbamoyl
benzoic acid or both.

9. A method of producing polyketide/peptides carrying para-aminobenzoic acid
and/or carbamoyl benzoic acid by inserting at least one DNA Fragment of
Claim 1 that encodes a PKS protein into a cell and causing the cell to express
the encoded PKS protein under conditions such that the PKS protein functions
to produce a polyketide carrying either a para-aminobenzoic acid or a
carbamoyl benzoic acid or both.

10. A method of activating nonproteinogenic amino acids like para-aminobenzoic
acid and/or carbamoyl benzoic acid for incorporation into peptides or
polyketides by inserting at least one DNA Fragment of Claim 1 that encodes a
PKS protein into a cell and causing the cell to express the encoded PKS protein
under conditions such that the PKS protein functions to produce a polyketide
carrying either a para-aminobenzoic acid or a carbamoyl benzoic acid or both.

11. Proteins encoded by the DNA of Claim 1.

12. Proteins encoded by the DNA of Claim 2.

13. An isolated and purified antibiotic produced by a process that includes at least
three proteins coded by DNA sequences of claim 1 in combination with
additional enzymes that modify the product to provide a non-naturally
occurring Albicidin-like product having at least one of the useful properties
reported for albicidin.

14. An antibiotic or antibiotics of claim 13 having at least one of the general
structures illustrated in Figure 11.

15. An antibiotic produced by the process of expressing the DNA of one or more
of the genes included in the Albicidin Biosynthetic Gene Clusters of Claim 1 in
a genetically modified host cell sustained in a culture media, and thereafter
separating the antibiotic from the host cell and culture media.

16. A process for producing an antibiotic that comprises modifying a host cell to
enhance expression of the DNA of claim 1 by insertion of expression
enhancing DNA into the genome of a Xanthomonas albilineans strain,
Escherichia coli strain, or other Albicidin producing microbial strain, in a
position operative to enhance expression of the enzymes of the Albicidin
Biosynthetic Gene Clusters, culturing the modified host cell to produce an
antibiotic and isolating the antibiotic.

17. An isolated purified antibiotic having at least 4 of the structural elements
illustrated in Figure 11, and an elemental composition of C^Yi^^f>^y

18. A method of protecting a plant against damage from albicidin that comprises
applying an agent that blocks expression of at least one gene in the Albicidin
Biosynthetic Gene Clusters of claim 1 to the plant to be protected.

19. A method of obtaining agents useful in blocking expression of albicidin by
screening materials against a modified host cell line that expresses the
Albicidin Biosynthesis Gene Clusters of claim 1 and selecting for materials
that stop or decrease albicidin production.

20. A method of protecting a plant against phytotoxic damage from an antibiotic
that comprises inserting into the plant and operably expressing at least one
resistance gene from the Albicidin Biosynthesis Gene Clusters of claim 1 in
the plant to be protected.

21. A plant reproductive part carrying an albicidin resistance gene of claim 1
selected from the group consisting of seeds, propagative materials and plant
parts.
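
The claims above refer to DNA that differs "in codon sequence due to the degeneracy of the genetic code" and to sequences that are "at least 70% homologous". The following is a minimal sketch of both ideas under stated assumptions. The two 12-base strings are hypothetical examples chosen to encode Met-Thr-Phe-Glu and are not taken from SEQ ID Nos. 1 to 25. Ungapped position-by-position identity is used as a simple stand-in for homology; the claims do not define how the percentage is to be computed. The names `translate` and `percent_identity` are illustrative.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding strand to one-letter protein code ('*' = stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(a: str, b: str) -> float:
    """Ungapped identity between two equal-length sequences, in percent."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for ungapped comparison")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Hypothetical coding sequences that differ only at third codon positions.
variant_1 = "ATGACCTTCGAA"
variant_2 = "ATGACTTTTGAG"

print(translate(variant_1), translate(variant_2))
print(percent_identity(variant_1, variant_2))
```

In this toy case the two strings differ at three of twelve positions yet translate to the same peptide, which illustrates how sequences can vary at the DNA level while encoding an identical protein.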
ABSTRACT

Three gene clusters that together encode albicidin biosynthesis, the complete gene DNA
sequences, the deduced protein sequences for the enzymes and methods for using the DNA
sequences are disclosed and discussed as well as methods for plant protection and creating new
antibiotics. The novel Albicidin family of antibiotics is disclosed and their structure deduced.
XALB1 Strand +

29 bp downstream from the TGA stop codon of albXVII

-40 -35 -30 -25 -20 -15 -10 -5 -1+ +5

17085=> ACCAiTOTGAACGGCCraCCCGCri^^

P S
4.30 0

XALB1 Strand +

400 bp downstream from the TAA stop codon of albIV

-40 -35 -30 -25 -20 -15 -10 -5 -1+ +5
55617=> CATGGCTGCAGGCCGMCT^ ^„ 55667

P S

XALB1 Strand -

62 bp, 170 bp and 560 bp downstream from the TAG stop codon of albXVI

-40 -35 -30 -25 -20 -15 -10 -5 -1+ +5

XALB3 Strand +

-40 -35 -30 -25 -20 -15 -10 -5 -1+ +5
8065=> GCAAAGAAAAGCGGAAACGAAAAAAGGGCCTACGGGCCCTTTOTTCTTCCA
8072=> AAAGCGGAAACGAJUUWU^GGGCCTACGGGCCCTTTTTTm

P S
4.78 0
3.94 86

Figure 6